* [RFC PATCH 19/20] bpf: Document BPF_MAP_TYPE_LANDLOCK_RULESET
From: Justin Suess @ 2026-04-07 20:01 UTC
To: ast, daniel, andrii, kpsingh, paul, mic, viro, brauner, kees
Cc: gnoack, jack, jmorris, serge, song, yonghong.song, martin.lau, m,
eddyz87, john.fastabend, sdf, skhan, bpf, linux-security-module,
linux-kernel, linux-fsdevel, Justin Suess
In-Reply-To: <20260407200157.3874806-1-utilityemal77@gmail.com>
Document the BPF_MAP_TYPE_LANDLOCK_RULESET map type and explain the
kfuncs it is associated with.
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---
Documentation/bpf/map_landlock_ruleset.rst | 181 +++++++++++++++++++++
1 file changed, 181 insertions(+)
create mode 100644 Documentation/bpf/map_landlock_ruleset.rst
diff --git a/Documentation/bpf/map_landlock_ruleset.rst b/Documentation/bpf/map_landlock_ruleset.rst
new file mode 100644
index 000000000000..90f3141a829b
--- /dev/null
+++ b/Documentation/bpf/map_landlock_ruleset.rst
@@ -0,0 +1,181 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+==============================
+BPF_MAP_TYPE_LANDLOCK_RULESET
+==============================
+
+``BPF_MAP_TYPE_LANDLOCK_RULESET`` is a specialized, array-backed map for
+holding references to Landlock rulesets that were created from userspace.
+It is meant to bridge BPF LSM policy selection with Landlock policy
+enforcement: userspace creates a normal Landlock ruleset, inserts its file
+descriptor into the map, and a BPF LSM program later looks up that ruleset and
+applies it with a Landlock kfunc during ``execve()`` preparation.
+
+BPF programs cannot create, inspect, or modify Landlock policy through this
+map. The looked-up object is exposed only as an opaque
+``struct bpf_landlock_ruleset`` reference.
+
+The map uses ``__u32`` keys as array indexes and stores one ruleset reference
+per slot. Like other array maps, its size is fixed at creation time and its
+elements are preallocated.
+
+Usage
+=====
+
+Kernel BPF
+----------
+
+.. note::
+ This map type is only supported for BPF LSM programs. In practice, it is
+ useful for sleepable BPF LSM programs attached to
+ ``bprm_creds_for_exec`` or ``bprm_creds_from_file``, because those are the
+ hooks where the associated Landlock kfuncs are available.
+
+bpf_map_lookup_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
+ void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)
+
+Lookup returns a trusted pointer to an opaque ``struct bpf_landlock_ruleset``.
+The verifier treats the result as a referenced BTF object, not as a pointer to
+the raw ``__u32`` map value declared in the map definition.
+
+Each successful lookup acquires a ruleset reference. The BPF program must
+release that reference with ``bpf_landlock_put_ruleset()`` on all paths after
+the lookup succeeds.
+
+The returned pointer is intended to be passed to
+``bpf_landlock_restrict_binprm()``. It is opaque and cannot be dereferenced
+or inspected from BPF.
+
+bpf_map_delete_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
+ long bpf_map_delete_elem(struct bpf_map *map, const void *key)
+
+Delete removes the ruleset reference stored in the selected slot and drops the
+map's own reference to that ruleset.
+
+Landlock kfuncs
+---------------
+
+The map contains objects designed to work with the following Landlock kfuncs:
+
+.. code-block:: c
+
+ void bpf_landlock_put_ruleset(const struct bpf_landlock_ruleset *ruleset)
+
+.. code-block:: c
+
+ int bpf_landlock_restrict_binprm(struct linux_binprm *bprm,
+ const struct bpf_landlock_ruleset *ruleset,
+ __u32 flags)
+
+``bpf_landlock_restrict_binprm()`` applies the looked-up ruleset to the new
+program credentials that are being prepared for ``execve()``. The ``flags``
+argument uses the same Landlock restriction flags as
+``landlock_restrict_self()``, including ``LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS``.
+When this flag is used from BPF, ``no_new_privs`` is staged through the exec
+context and committed only after exec reaches point-of-no-return. This avoids
+side effects on failed executions or ``AT_EXECVE_CHECK`` while ensuring that
+the resulting task cannot gain more privileges through later exec transitions.
+
+Userspace
+---------
+
+bpf_map_update_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
+ int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags)
+
+Userspace populates the map by writing a Landlock ruleset file descriptor into
+the selected slot. The map uses FD-array update semantics:
+
+- ``key`` points to a ``__u32`` array index.
+- ``value`` points to a ``__u32`` containing the ruleset file descriptor.
+- ``flags`` must be ``BPF_ANY``.
+
+The supplied file descriptor must refer to a valid Landlock ruleset.
+
+Userspace lookup of map contents is not supported for this map type.
+
+Example
+=======
+
+Kernel BPF
+----------
+
+The following snippet shows a sleepable BPF LSM program that looks up a
+ruleset, applies it during exec credential preparation, and releases the
+lookup reference.
+
+.. code-block:: c
+
+    struct {
+            __uint(type, BPF_MAP_TYPE_LANDLOCK_RULESET);
+            __uint(max_entries, 1);
+            __type(key, __u32);
+            __type(value, __u32);
+    } ruleset_map SEC(".maps");
+
+    SEC("lsm.s/bprm_creds_for_exec")
+    int BPF_PROG(apply_ruleset, struct linux_binprm *bprm)
+    {
+            const struct bpf_landlock_ruleset *ruleset;
+            __u32 key = 0;
+
+            ruleset = bpf_map_lookup_elem(&ruleset_map, &key);
+            if (!ruleset)
+                    return 0;
+
+            bpf_landlock_restrict_binprm(
+                    bprm, ruleset, LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS);
+            bpf_landlock_put_ruleset(ruleset);
+            return 0;
+    }
+
+Userspace
+---------
+
+The following snippet shows how to insert a previously created Landlock
+ruleset into the map.
+
+.. code-block:: c
+
+    int populate_ruleset_map(int map_fd, int ruleset_fd)
+    {
+            __u32 key = 0;
+            __u32 value = ruleset_fd;
+
+            return bpf_map_update_elem(map_fd, &key, &value, BPF_ANY);
+    }
+
+Semantics
+=========
+
+- Map creation requires ``CONFIG_SECURITY_LANDLOCK``. Otherwise,
+ ``BPF_MAP_CREATE`` for this type fails with ``-EOPNOTSUPP``.
+- Map definitions use ``sizeof(__u32)`` for both keys and values because
+ userspace writes ruleset file descriptors into the map.
+- From BPF, only ``bpf_map_lookup_elem()`` and ``bpf_map_delete_elem()`` are
+ supported for this map type.
+- From userspace, insertion is done with ``bpf_map_update_elem()`` using a
+ Landlock ruleset FD.
+- The looked-up value is an opaque, trusted BTF object reference, so BPF must
+ treat it as a handle and release it with ``bpf_landlock_put_ruleset()``.
+- ``LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS`` on the BPF path pins the resulting
+ task with ``no_new_privs`` after exec is committed. When used from
+ ``bprm_creds_from_file``, this does not retroactively suppress privilege gain
+ for the current exec transition itself.
+- If Landlock support is disabled in the running kernel, programs using the
+ associated Landlock kfuncs may still load, but the kfunc call returns
+ ``-EOPNOTSUPP`` at runtime.
+
+See ``tools/testing/selftests/bpf/progs/landlock_kfuncs.c`` for a complete
+example.
--
2.53.0
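The documentation in this patch assumes that userspace already holds a
Landlock ruleset FD. A minimal sketch of how that FD might be produced with
the existing Landlock syscalls follows. The helper names
create_example_ruleset() and allow_execute_beneath() are hypothetical and do
not come from the series; the syscalls, structures, and LANDLOCK_* constants
are the existing UAPI from <linux/landlock.h>. The chosen access rights are
only an illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/landlock.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a ruleset that handles file writes and
 * executions, so both are denied unless a rule allows them.
 * Returns a ruleset FD, or -1 with errno set (for example ENOSYS or
 * EOPNOTSUPP when Landlock is not available).
 */
static int create_example_ruleset(void)
{
	struct landlock_ruleset_attr attr = {
		.handled_access_fs = LANDLOCK_ACCESS_FS_WRITE_FILE |
				     LANDLOCK_ACCESS_FS_EXECUTE,
	};

	return (int)syscall(SYS_landlock_create_ruleset, &attr,
			    sizeof(attr), 0);
}

/*
 * Hypothetical helper: allow execution of files beneath @path.
 * Returns 0 on success, or -1 with errno set.
 */
static int allow_execute_beneath(int ruleset_fd, const char *path)
{
	struct landlock_path_beneath_attr rule = {
		.allowed_access = LANDLOCK_ACCESS_FS_EXECUTE,
	};
	int ret, saved_errno;

	assert(ruleset_fd >= 0 && path != NULL);

	rule.parent_fd = open(path, O_PATH | O_CLOEXEC);
	if (rule.parent_fd < 0)
		return -1;

	ret = (int)syscall(SYS_landlock_add_rule, ruleset_fd,
			   LANDLOCK_RULE_PATH_BENEATH, &rule, 0);

	/* Preserve the errno of the add_rule call across close(). */
	saved_errno = errno;
	close(rule.parent_fd);
	errno = saved_errno;
	return ret;
}
```

The FD returned by create_example_ruleset() is what would be passed as
ruleset_fd to populate_ruleset_map() in the documented userspace example.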
* [RFC PATCH 20/20] MAINTAINERS: update entry for the Landlock subsystem
From: Justin Suess @ 2026-04-07 20:01 UTC
To: ast, daniel, andrii, kpsingh, paul, mic, viro, brauner, kees
Cc: gnoack, jack, jmorris, serge, song, yonghong.song, martin.lau, m,
eddyz87, john.fastabend, sdf, skhan, bpf, linux-security-module,
linux-kernel, linux-fsdevel, Justin Suess
In-Reply-To: <20260407200157.3874806-1-utilityemal77@gmail.com>
Update the MAINTAINERS file to cover the new selftest files, the
cross-subsystem documentation, and the kernel-internal Landlock header.
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---
MAINTAINERS | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index c3fe46d7c4bc..e9ad2ed1237a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14386,12 +14386,16 @@ S: Supported
W: https://landlock.io
T: git https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git
F: Documentation/admin-guide/LSM/landlock.rst
+F: Documentation/bpf/map_landlock_ruleset.rst
F: Documentation/security/landlock.rst
F: Documentation/userspace-api/landlock.rst
F: fs/ioctl.c
+F: include/linux/landlock.h
F: include/uapi/linux/landlock.h
F: samples/landlock/
F: security/landlock/
+F: tools/testing/selftests/bpf/prog_tests/landlock_kfuncs.c
+F: tools/testing/selftests/bpf/progs/landlock_kfuncs.c
F: tools/testing/selftests/landlock/
K: landlock
K: LANDLOCK
--
2.53.0
* Re: [PATCH v4 3/3] selinux: fix overlayfs mmap() and mprotect() access checks
From: Paul Moore @ 2026-04-07 20:21 UTC
To: Stephen Smalley
Cc: Ondrej Mosnacek, linux-security-module, selinux, linux-fsdevel,
linux-unionfs, linux-erofs, Amir Goldstein, Gao Xiang,
Christian Brauner
In-Reply-To: <CAEjxPJ62=0v9QYJ6s0DrwRp4WZna8f9wnuM_DUUNrcz2dd_kog@mail.gmail.com>
On Tue, Apr 7, 2026 at 3:20 PM Stephen Smalley
<stephen.smalley.work@gmail.com> wrote:
> On Tue, Apr 7, 2026 at 10:35 AM Paul Moore <paul@paul-moore.com> wrote:
> > On Tue, Apr 7, 2026 at 8:14 AM Stephen Smalley
> > <stephen.smalley.work@gmail.com> wrote:
> > > On Thu, Apr 2, 2026 at 11:09 PM Paul Moore <paul@paul-moore.com> wrote:
> > > >
> > > > The existing SELinux security model for overlayfs is to allow access if
> > > > the current task is able to access the top level file (the "user" file)
> > > > and the mounter's credentials are sufficient to access the lower
> > > > level file (the "backing" file). Unfortunately, the current code does
> > > > not properly enforce these access controls for both mmap() and mprotect()
> > > > operations on overlayfs filesystems.
> > > >
> > > > This patch makes use of the newly created security_mmap_backing_file()
> > > > LSM hook to provide the missing backing file enforcement for mmap()
> > > > operations, and leverages the backing file API and new LSM blob to
> > > > provide the necessary information to properly enforce the mprotect()
> > > > access controls.
> > > >
> > > > Cc: stable@vger.kernel.org
> > > > Signed-off-by: Paul Moore <paul@paul-moore.com>
> > >
> > > Do you have tests for these changes showing the before and after (i.e.
> > > failing without your patches, passing with them)? I tried running an
> > > earlier set from Ondrej but they failed.
> >
> > A few months ago I sent you and Ondrej some feedback on those early
> > tests from Ondrej, but yes, I also had problems with Ondrej's tests.
> > I've been using a hacked up combination of the existing tests, some of
> > Ondrej's additions, and an additional debug/test patch to ensure the
> > labeling is correct. It's far from ideal, but I didn't invest time in
> > test development as I assumed Ondrej would continue his efforts there
> > (unfortunately it doesn't appear that he has?), and I wanted to focus
> > on getting a solution as soon as possible for obvious reasons.
>
> Ok, I'm happy to look at even unpolished tests - just want something I
> can use to exercise the before and after states.
Hopefully Ondrej can provide an updated patch.
--
paul-moore.com
* Re: [RFC PATCH 00/20] BPF interface for applying Landlock rulesets
From: Ihor Solodrai @ 2026-04-08 4:40 UTC
To: Justin Suess, ast, daniel, andrii, kpsingh, paul, mic, viro,
brauner, kees
Cc: gnoack, jack, jmorris, serge, song, yonghong.song, martin.lau, m,
eddyz87, john.fastabend, sdf, skhan, bpf, linux-security-module,
linux-kernel, linux-fsdevel
In-Reply-To: <20260407200157.3874806-1-utilityemal77@gmail.com>
On 4/7/26 1:01 PM, Justin Suess wrote:
> Hello,
>
> This series lets sleepable BPF LSM programs apply an existing,
> userspace-created Landlock ruleset to a program during exec.
>
> The goal is not to move Landlock policy definition into BPF, nor to create a
> second policy engine. Instead, BPF is used only to select when an already
> valid Landlock ruleset should be applied, based on runtime exec context.
>
> Background
> ===
>
> Landlock is primarily a syscall-driven, unprivileged-first LSM. That model
> works well when the application being sandboxed can create and enforce its own
> rulesets, or when a trusted launcher can impose restrictions directly before
> running a trusted target.
>
> That becomes harder when the target program is not under first-party control,
> for example:
>
> 1. third-party binaries,
> 2. unmodified container images,
> 3. programs reached through shells, wrappers, or service managers, and
> 4. user-supplied or otherwise untrusted code.
>
> In these cases, an external supervisor may want to apply a Landlock ruleset to
> the final executed program, while leaving unrelated parents or helper
> processes alone.
>
> Why external sandboxing is awkward today
> ===
>
> There are two recurring problems.
>
> First, userspace cannot reliably predict every file a target may need across
> different systems, packaging layouts, and runtime conditions. Shared
> libraries, configuration files, interpreters, and helper binaries often depend
> on details that are only known at runtime.
>
> Second, Landlock inheritance is intentionally one-way. Once a task is
> restricted, descendants inherit that domain and may only become more
> restricted. This is exactly what Landlock should do, but it makes external
> sandboxing awkward when the program of interest is buried inside a larger exec
> chain. Applying restrictions too early can affect unrelated intermediates;
> applying them too late misses the target entirely.
>
> This series addresses that target-selection problem.
>
> Overview
> ===
>
> This series adds a small BPF-to-Landlock bridge:
>
> 1. userspace creates a normal Landlock ruleset through the existing ABI;
> 2. userspace inserts that ruleset FD into a new
> BPF_MAP_TYPE_LANDLOCK_RULESET map;
> 3. a sleepable BPF LSM program attached to an exec-time hook looks up the
> ruleset; and
> 4. the program calls a kfunc to apply that ruleset to the new program's
> credentials before exec completes.
>
> The important point is that BPF does not create, inspect, or mutate Landlock
> policy here. It only decides whether to apply a ruleset that was already
> created and validated through Landlock's existing userspace API.
>
> Interface
> ===
>
> The series adds:
>
> 1. bpf_landlock_restrict_binprm(), which applies a referenced ruleset to
> struct linux_binprm credentials;
> 2. bpf_landlock_put_ruleset(), which releases a referenced ruleset; and
> 3. BPF_MAP_TYPE_LANDLOCK_RULESET, a specialized map type for holding
> references to Landlock rulesets originating from userspace file
> descriptors.
> 4. A new field in the linux_binprm struct to enable application of
> task_set_no_new_privs once execution is beyond the point of no return.
>
> The kfuncs are restricted to sleepable BPF LSM programs attached to
> bprm_creds_for_exec and bprm_creds_from_file, which are the points where the
> new program's credentials may still be updated safely.
>
> This series also adds LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS. On the BPF path,
> this is staged through the exec context and committed only after exec reaches
> point-of-no-return. This avoids side effects on failed executions while
> ensuring that the resulting task cannot gain more privileges through later exec
> transitions. This is done through the set_nnp_on_point_of_no_return field.
>
> This has a little subtlety: LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS in the BPF
> path will not stop the current execution from escalating at all; only subsequent
> ones. This is intentional to allow landlock policies to be applied through a
> setuid transition for instance, without affecting the current escalation.
>
> Semantics
> ===
>
> This proposal is intended to preserve Landlock semantics as much as practical
> for an exec-time BPF attachment model:
>
> 1. only pre-existing Landlock rulesets may be applied;
> 2. BPF cannot construct, inspect, or modify rulesets;
> 3. enforcement still happens before the new program begins execution;
> 4. normal Landlock inheritance, layering, and future composition remain
> unchanged; and
> 5. this does not bypass Landlock's privilege checks for applying Landlock
> rulesets.
>
> In other words, BPF acts as an external selector for when to apply Landlock,
> not as a replacement for Landlock's enforcement engine.
>
> All behavior, future access rights, and previous access rights are designed
> to automatically be supported from either BPF or existing syscall contexts.
>
> The main semantic difference is LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS on the BPF
> path: it guarantees that the resulting task is pinned with no_new_privs before
> it can perform later exec transitions, but it does not retroactively suppress
> privilege gain for the current exec transition itself.
>
> The other exception to semantics is the LANDLOCK_RESTRICT_SELF_TSYNC flag.
> (see Points of Feedback section)
>
> Patch layout
> ===
>
> Patches 1-5 prepare the Landlock side by moving shared ruleset logic out of
> syscalls.c, adding a no_new_privs flag for non-syscall callers, exposing
> linux_binprm->set_nnp_on_point_of_no_return as an interface to set no_new_privs
> on the point of no return, and making deferred ruleset destruction RCU-safe.
>
> Patches 6-10 add the BPF-facing pieces: the Landlock kfuncs, the new map type,
> syscall handling for that map, and verifier support.
>
> Patches 11-15 add selftests and the small bpftool update needed for the new
> map type.
>
> Patches 16-20 add docs and bump the ABI version and update MAINTAINERS.
>
> Feedback is especially welcome on the overall interface shape, the choice of
> hooks, and the map semantics.
>
> Testing
> ===
>
> This patch series has two portions of tests.
>
> One lives in the traditional Landlock selftests, for the new
> LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS flag.
>
> The other suite lives under the BPF selftests, and this tests the Landlock
> kfuncs and the new BPF_MAP_TYPE_LANDLOCK_RULESET.
>
> This patch series was run through BPF CI, the results of which are here. [1]
>
> All mentioned tests are passing, as well as the BPF CI.
>
> [1] : https://github.com/kernel-patches/bpf/pull/11562
Hello Justin.
I regret to disappoint you with a lame piece of feedback, but the
series hasn't been picked up properly by the automated BPF CI pipeline:
https://github.com/kernel-patches/bpf/pull/11709
I suggest you rebase on top of bpf-next/master [1], and re-submit to
the mailing list with a bpf-next tag in subject:
"[RFC PATCH bpf-next ...] bpf: ..."
I'm pretty sure the AI bot will find something annoying to address.
Other than that, please be patient. It'll probably take a while for
maintainers and reviewers to digest this work before anyone can
meaningfully comment. Thanks!
[1] https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/
>
> Points of Feedback
> ===
>
> First, the new set_nnp_on_point_of_no_return field in struct linux_binprm.
> This field was needed to request that task_set_no_new_privs be set during an
> execution, but only after the execution has proceeded beyond the point of no
> return. I couldn't find a way to express this semantic without adding a new
> bitfield to struct linux_binprm and a conditional in fs/exec.c. Please see
> patch 2.
>
> Feedback on the BPF testing harness, which was generated with AI assistance as
> disclosed in the commit footer, is welcomed. I have only limited familiarity
> with BPF testing practices. These tests were made with strong human supervision.
> See patches 14 and 15.
>
> Feedback on the NO_NEW_PRIVS situation is also welcomed. Because task_set_no_new_privs()
> would otherwise leak state on failed executions or AT_EXECVE_CHECK, this series
> stages no_new_privs through the exec context and only commits it after
> point-of-no-return. This preserves failure behavior while still ensuring that
> the resulting task cannot elevate further through later exec transitions.
> When called from bprm_creds_from_file, this does not retroactively change the
> privilege outcome of the current exec transition itself.
>
> See patch 2 and 3.
>
> Next, the RCU in the landlock_ruleset. Existing BPF maps use RCU to make sure maps
> holding references stay valid. I altered the landlock ruleset to use rcu_work
> to make sure that the rcu is synchronized before putting on a ruleset, and
> acquire the rcu in the arraymap implementation. See patches 5-10.
>
> Next, the semantics of the map. What operations should be supported from BPF
> and userspace and what data types should they return? I consider the struct
> bpf_landlock_ruleset to be opaque. Userspace can add items to the map via the
> fd, delete items by their index, and BPF can delete and lookup items by their
> index. Items cannot be updated, only swapped.
>
> Finally, the handling of the LANDLOCK_RESTRICT_SELF_TSYNC flag. This flag has
> no meaning in a pre-execution context, as the credentials during the designated
> LSM hooks (bprm_creds_for_exec/creds_from_file) still represent the pre-execution
> task. Therefore, this flag is invalidated and attempting to use it with
> bpf_landlock_restrict_binprm will return -EINVAL. Otherwise, the flag would
> result in applying the landlock ruleset to the wrong target in addition to the
> intended one. (see patch 2). This behavior is validated with selftests.
>
> Existing works / Credits
> ===
>
> Mickaël Salaün created patchsets adding BPF tracepoints for landlock in [2] [3].
>
> Mickaël also gave feedback on this feature and the idea in this GitHub thread. [4]
>
> Günther Noack initially received and provided initial feedback on this idea as
> an early prototype.
>
> Liz Rice, author of "Learning eBPF: Programming the Linux Kernel for Enhanced
> Observability, Networking, and Security" provided background and inspired me to
> experiment with BPF and the BPF LSM. [5]
>
> [2] : https://lore.kernel.org/all/20250523165741.693976-1-mic@digikod.net/
> [3] : https://lore.kernel.org/linux-security-module/20260406143717.1815792-1-mic@digikod.net/
> [4] : https://github.com/landlock-lsm/linux/issues/56
> [5] : https://wellesleybooks.com/book/9781098135126
>
> Kind Regards,
> Justin Suess
>
> Justin Suess (20):
> landlock: Move operations from syscall into ruleset code
> execve: Add set_nnp_on_point_of_no_return
> landlock: Implement LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
> selftests/landlock: Cover LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
> landlock: Make ruleset deferred free RCU safe
> bpf: lsm: Add Landlock kfuncs
> bpf: arraymap: Implement Landlock ruleset map
> bpf: Add Landlock ruleset map type
> bpf: syscall: Handle Landlock ruleset maps
> bpf: verifier: Add Landlock ruleset map support
> selftests/bpf: Add Landlock kfunc declarations
> selftests/landlock: Rename gettid wrapper for BPF reuse
> selftests/bpf: Enable Landlock in selftests kernel.
> selftests/bpf: Add Landlock kfunc test program
> selftests/bpf: Add Landlock kfunc test runner
> landlock: Bump ABI version
> tools: bpftool: Add documentation for landlock_ruleset
> landlock: Document LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
> bpf: Document BPF_MAP_TYPE_LANDLOCK_RULESET
> MAINTAINERS: update entry for the Landlock subsystem
>
> Documentation/bpf/map_landlock_ruleset.rst | 181 +++++
> Documentation/userspace-api/landlock.rst | 22 +-
> MAINTAINERS | 4 +
> fs/exec.c | 8 +
> include/linux/binfmts.h | 7 +-
> include/linux/bpf_lsm.h | 15 +
> include/linux/bpf_types.h | 1 +
> include/linux/landlock.h | 92 +++
> include/uapi/linux/bpf.h | 1 +
> include/uapi/linux/landlock.h | 14 +
> kernel/bpf/arraymap.c | 67 ++
> kernel/bpf/bpf_lsm.c | 145 ++++
> kernel/bpf/syscall.c | 4 +-
> kernel/bpf/verifier.c | 15 +-
> samples/landlock/sandboxer.c | 7 +-
> security/landlock/limits.h | 2 +-
> security/landlock/ruleset.c | 198 ++++-
> security/landlock/ruleset.h | 25 +-
> security/landlock/syscalls.c | 158 +---
> .../bpf/bpftool/Documentation/bpftool-map.rst | 2 +-
> tools/bpf/bpftool/map.c | 2 +-
> tools/include/uapi/linux/bpf.h | 1 +
> tools/lib/bpf/libbpf.c | 1 +
> tools/lib/bpf/libbpf_probes.c | 6 +
> tools/testing/selftests/bpf/bpf_kfuncs.h | 20 +
> tools/testing/selftests/bpf/config | 5 +
> tools/testing/selftests/bpf/config.x86_64 | 1 -
> .../bpf/prog_tests/landlock_kfuncs.c | 733 ++++++++++++++++++
> .../selftests/bpf/progs/landlock_kfuncs.c | 92 +++
> tools/testing/selftests/landlock/base_test.c | 10 +-
> tools/testing/selftests/landlock/common.h | 28 +-
> tools/testing/selftests/landlock/fs_test.c | 103 +--
> tools/testing/selftests/landlock/net_test.c | 55 +-
> .../testing/selftests/landlock/ptrace_test.c | 14 +-
> .../landlock/scoped_abstract_unix_test.c | 51 +-
> .../selftests/landlock/scoped_base_variants.h | 23 +
> .../selftests/landlock/scoped_common.h | 5 +-
> .../selftests/landlock/scoped_signal_test.c | 30 +-
> tools/testing/selftests/landlock/wrappers.h | 2 +-
> 39 files changed, 1877 insertions(+), 273 deletions(-)
> create mode 100644 Documentation/bpf/map_landlock_ruleset.rst
> create mode 100644 include/linux/landlock.h
> create mode 100644 tools/testing/selftests/bpf/prog_tests/landlock_kfuncs.c
> create mode 100644 tools/testing/selftests/bpf/progs/landlock_kfuncs.c
>
>
> base-commit: 8c6a27e02bc55ab110d1828610048b19f903aaec
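The flag handling described in the quoted cover letter can be modeled in
plain userspace C. This is a toy sketch under stated assumptions, not kernel
code: the toy_* types and functions are hypothetical stand-ins, the TOY_*
flag bits are placeholders rather than the UAPI values, and only the field
name set_nnp_on_point_of_no_return is taken from the series. It shows the
two described behaviors: the TSYNC flag is rejected with -EINVAL, and
no_new_privs is only staged by the hook and committed once exec passes the
point of no return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag bits for this model only; not the UAPI values. */
#define TOY_RESTRICT_NO_NEW_PRIVS	(1u << 0)
#define TOY_RESTRICT_TSYNC		(1u << 1)

/* Toy stand-in for the task whose exec is being prepared. */
struct toy_task {
	bool no_new_privs;
};

/* Toy stand-in for struct linux_binprm. */
struct toy_binprm {
	struct toy_task *task;
	/* Field name from the series: request NNP, apply it later. */
	bool set_nnp_on_point_of_no_return;
};

/* Model of the kfunc's flag handling: reject TSYNC, stage NNP. */
static int toy_restrict_binprm(struct toy_binprm *bprm, unsigned int flags)
{
	assert(bprm != NULL);

	if (flags & TOY_RESTRICT_TSYNC)
		return -EINVAL;
	if (flags & TOY_RESTRICT_NO_NEW_PRIVS)
		bprm->set_nnp_on_point_of_no_return = true;
	return 0;
}

/*
 * Model of one exec attempt. In this toy, the hook's error aborts the
 * exec. @fails_early models a failure before the point of no return,
 * and @check_only models AT_EXECVE_CHECK; neither commits NNP.
 */
static int toy_exec(struct toy_task *task, unsigned int flags,
		    bool fails_early, bool check_only)
{
	struct toy_binprm bprm = { .task = task };
	int ret;

	ret = toy_restrict_binprm(&bprm, flags);
	if (ret)
		return ret;
	if (fails_early)
		return -ENOEXEC;
	if (check_only)
		return 0;

	/* Point of no return: commit the staged request. */
	if (bprm.set_nnp_on_point_of_no_return)
		bprm.task->no_new_privs = true;
	return 0;
}
```

In this model, a failed or check-only exec leaves the task's no_new_privs
untouched, which is the side-effect-free behavior the cover letter aims for.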
* Re: [PATCH v2] KEYS: trusted: Debugging as a feature
From: Jarkko Sakkinen @ 2026-04-08 8:24 UTC
To: Srish Srinivasan
Cc: linux-integrity, keyrings, Nayna Jain, James Bottomley,
Mimi Zohar, David Howells, Paul Moore, James Morris,
Serge E. Hallyn, Ahmad Fatoum, Pengutronix Kernel Team, open list,
open list:SECURITY SUBSYSTEM
In-Reply-To: <0ce8d850-9ca7-4327-a6be-d1cb84925915@linux.ibm.com>
On Thu, Mar 26, 2026 at 10:34:58PM +0530, Srish Srinivasan wrote:
>
> On 3/24/26 4:30 PM, Jarkko Sakkinen wrote:
> > TPM_DEBUG, and other similar flags, are a non-standard way to specify a
> > feature in Linux kernel. Introduce CONFIG_TRUSTED_KEYS_DEBUG for
> > trusted keys, and use it to replace these ad-hoc feature flags.
> >
> > Given that trusted keys debug dumps can contain sensitive data, harden
> > the feature as follows:
> >
> > 1. In the Kconfig description postulate that pr_debug() statements must be
> > used.
> > 2. Use pr_debug() statements in TPM 1.x driver to print the protocol dump.
> >
> > Traces, when actually needed, can be easily enabled by providing
> > trusted.dyndbg='+p' in the kernel command-line.
> >
> > Cc: Srish Srinivasan <ssrish@linux.ibm.com>
> > Reported-by: Nayna Jain <nayna@linux.ibm.com>
> > Closes: https://lore.kernel.org/all/7f8b8478-5cd8-4d97-bfd0-341fd5cf10f9@linux.ibm.com/
> > Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
>
>
> Tested on PKWM and emulated TPM backends.
>
> Tested-by: Srish Srinivasan <ssrish@linux.ibm.com>
Thank you!
As it is being promoted to a feature (as it should be, since ad-hoc
compilation flags are harmful), this also requires a boot flag so that
the "I know what I'm doing" aspect is addressed.
I'll send one more round with a flag 'trusted.debug=0|1'. These extra
steps protect production kernels to a reasonable degree.
BR, Jarkko
* Re: [PATCH v2] KEYS: trusted: Debugging as a feature
From: Jarkko Sakkinen @ 2026-04-08 8:26 UTC
To: Nayna Jain
Cc: linux-integrity, keyrings, Srish Srinivasan, James Bottomley,
Mimi Zohar, David Howells, Paul Moore, James Morris,
Serge E. Hallyn, Ahmad Fatoum, Pengutronix Kernel Team, open list,
open list:SECURITY SUBSYSTEM
In-Reply-To: <afc489d2-a62f-4604-8e56-219311b46516@linux.ibm.com>
On Mon, Apr 06, 2026 at 10:42:00PM -0400, Nayna Jain wrote:
>
> On 3/24/26 7:00 AM, Jarkko Sakkinen wrote:
> > TPM_DEBUG, and other similar flags, are a non-standard way to specify a
> > feature in Linux kernel. Introduce CONFIG_TRUSTED_KEYS_DEBUG for
> > trusted keys, and use it to replace these ad-hoc feature flags.
> >
> > Given that trusted keys debug dumps can contain sensitive data, harden
> > the feature as follows:
> >
> > 1. In the Kconfig description postulate that pr_debug() statements must be
> > used.
> > 2. Use pr_debug() statements in TPM 1.x driver to print the protocol dump.
> >
> > Traces, when actually needed, can be easily enabled by providing
> > trusted.dyndbg='+p' in the kernel command-line.
> >
> > Cc: Srish Srinivasan <ssrish@linux.ibm.com>
> > Reported-by: Nayna Jain <nayna@linux.ibm.com>
> > Closes: https://lore.kernel.org/all/7f8b8478-5cd8-4d97-bfd0-341fd5cf10f9@linux.ibm.com/
> > Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
> > ---
> > v2:
> > - Implement for all trusted keys backends.
> > - Add HAVE_TRUSTED_KEYS_DEBUG as it is a good practice despite full
> > coverage.
> > ---
> > include/keys/trusted-type.h | 18 +++++-------
> > security/keys/trusted-keys/Kconfig | 19 ++++++++++++
> > security/keys/trusted-keys/trusted_caam.c | 4 +--
> > security/keys/trusted-keys/trusted_tpm1.c | 36 +++++++++++------------
> > 4 files changed, 46 insertions(+), 31 deletions(-)
> >
> > diff --git a/include/keys/trusted-type.h b/include/keys/trusted-type.h
> > index 03527162613f..620a1f890b6b 100644
> > --- a/include/keys/trusted-type.h
> > +++ b/include/keys/trusted-type.h
> > @@ -83,18 +83,16 @@ struct trusted_key_source {
> > extern struct key_type key_type_trusted;
> > -#define TRUSTED_DEBUG 0
> > -
> > -#if TRUSTED_DEBUG
> > +#ifdef CONFIG_TRUSTED_KEYS_DEBUG
> > static inline void dump_payload(struct trusted_key_payload *p)
> > {
> > - pr_info("key_len %d\n", p->key_len);
> > - print_hex_dump(KERN_INFO, "key ", DUMP_PREFIX_NONE,
> > - 16, 1, p->key, p->key_len, 0);
> > - pr_info("bloblen %d\n", p->blob_len);
> > - print_hex_dump(KERN_INFO, "blob ", DUMP_PREFIX_NONE,
> > - 16, 1, p->blob, p->blob_len, 0);
> > - pr_info("migratable %d\n", p->migratable);
> > + pr_debug("key_len %d\n", p->key_len);
> > + print_hex_dump_debug("key ", DUMP_PREFIX_NONE,
> > + 16, 1, p->key, p->key_len, 0);
> > + pr_debug("bloblen %d\n", p->blob_len);
> > + print_hex_dump_debug("blob ", DUMP_PREFIX_NONE,
> > + 16, 1, p->blob, p->blob_len, 0);
> > + pr_debug("migratable %d\n", p->migratable);
> > }
> > #else
> > static inline void dump_payload(struct trusted_key_payload *p)
> > diff --git a/security/keys/trusted-keys/Kconfig b/security/keys/trusted-keys/Kconfig
> > index 9e00482d886a..2ad9ba0e03f1 100644
> > --- a/security/keys/trusted-keys/Kconfig
> > +++ b/security/keys/trusted-keys/Kconfig
> > @@ -1,10 +1,25 @@
> > config HAVE_TRUSTED_KEYS
> > bool
> > +config HAVE_TRUSTED_KEYS_DEBUG
> > + bool
> > +
> > +config TRUSTED_KEYS_DEBUG
> > + bool "Debug trusted keys"
> > + depends on HAVE_TRUSTED_KEYS_DEBUG
> > + default n
> > + help
> > + Trusted keys backends and core code that support debug dumps
> > + can opt-in that feature here. Dumps must only use DEBUG
> > + level output, as sensitive data may pass by. In the
> > + kernel-command line traces can be enabled via
> > + trusted.dyndbg='+p'.
>
> Would it be good idea to add an explicit note/warning:
>
>
> NOTE: This option is intended for debugging purposes only. Do not enable on
> production systems as debug output may expose sensitive cryptographic
> material.
> If you are unsure, say N.
>
> Apart from this, looks good to me.
>
> Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Thanks, I'll add your tag, but would you mind quickly screening v3 again,
where I add "trusted.debug=0|1"? And yes, your suggestion about an extra
warning makes sense.
Let's make this as safe as possible. Mistakes do happen... and then those
measures pay off :-)
BR, Jarkko
* Re: [PATCH v2] KEYS: trusted: Debugging as a feature
From: Jarkko Sakkinen @ 2026-04-08 8:29 UTC
To: Nayna Jain
Cc: linux-integrity, keyrings, Srish Srinivasan, James Bottomley,
Mimi Zohar, David Howells, Paul Moore, James Morris,
Serge E. Hallyn, Ahmad Fatoum, Pengutronix Kernel Team, open list,
open list:SECURITY SUBSYSTEM
In-Reply-To: <adYRURAJfNCu0FYB@kernel.org>
On Wed, Apr 08, 2026 at 11:27:01AM +0300, Jarkko Sakkinen wrote:
> On Mon, Apr 06, 2026 at 10:42:00PM -0400, Nayna Jain wrote:
> >
> > On 3/24/26 7:00 AM, Jarkko Sakkinen wrote:
> > > TPM_DEBUG, and other similar flags, are a non-standard way to specify a
> > > feature in Linux kernel. Introduce CONFIG_TRUSTED_KEYS_DEBUG for
> > > trusted keys, and use it to replace these ad-hoc feature flags.
> > >
> > > Given that trusted keys debug dumps can contain sensitive data, harden
> > > the feature as follows:
> > >
> > > 1. In the Kconfig description postulate that pr_debug() statements must be
> > > used.
> > > 2. Use pr_debug() statements in TPM 1.x driver to print the protocol dump.
> > >
> > > Traces, when actually needed, can be easily enabled by providing
> > > trusted.dyndbg='+p' in the kernel command-line.
> > >
> > > Cc: Srish Srinivasan <ssrish@linux.ibm.com>
> > > Reported-by: Nayna Jain <nayna@linux.ibm.com>
> > > Closes: https://lore.kernel.org/all/7f8b8478-5cd8-4d97-bfd0-341fd5cf10f9@linux.ibm.com/
> > > Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
> > > ---
> > > v2:
> > > - Implement for all trusted keys backends.
> > > - Add HAVE_TRUSTED_KEYS_DEBUG as it is a good practice despite full
> > > coverage.
> > > ---
> > > include/keys/trusted-type.h | 18 +++++-------
> > > security/keys/trusted-keys/Kconfig | 19 ++++++++++++
> > > security/keys/trusted-keys/trusted_caam.c | 4 +--
> > > security/keys/trusted-keys/trusted_tpm1.c | 36 +++++++++++------------
> > > 4 files changed, 46 insertions(+), 31 deletions(-)
> > >
> > > diff --git a/include/keys/trusted-type.h b/include/keys/trusted-type.h
> > > index 03527162613f..620a1f890b6b 100644
> > > --- a/include/keys/trusted-type.h
> > > +++ b/include/keys/trusted-type.h
> > > @@ -83,18 +83,16 @@ struct trusted_key_source {
> > > extern struct key_type key_type_trusted;
> > > -#define TRUSTED_DEBUG 0
> > > -
> > > -#if TRUSTED_DEBUG
> > > +#ifdef CONFIG_TRUSTED_KEYS_DEBUG
> > > static inline void dump_payload(struct trusted_key_payload *p)
> > > {
> > > - pr_info("key_len %d\n", p->key_len);
> > > - print_hex_dump(KERN_INFO, "key ", DUMP_PREFIX_NONE,
> > > - 16, 1, p->key, p->key_len, 0);
> > > - pr_info("bloblen %d\n", p->blob_len);
> > > - print_hex_dump(KERN_INFO, "blob ", DUMP_PREFIX_NONE,
> > > - 16, 1, p->blob, p->blob_len, 0);
> > > - pr_info("migratable %d\n", p->migratable);
> > > + pr_debug("key_len %d\n", p->key_len);
> > > + print_hex_dump_debug("key ", DUMP_PREFIX_NONE,
> > > + 16, 1, p->key, p->key_len, 0);
> > > + pr_debug("bloblen %d\n", p->blob_len);
> > > + print_hex_dump_debug("blob ", DUMP_PREFIX_NONE,
> > > + 16, 1, p->blob, p->blob_len, 0);
> > > + pr_debug("migratable %d\n", p->migratable);
> > > }
> > > #else
> > > static inline void dump_payload(struct trusted_key_payload *p)
> > > diff --git a/security/keys/trusted-keys/Kconfig b/security/keys/trusted-keys/Kconfig
> > > index 9e00482d886a..2ad9ba0e03f1 100644
> > > --- a/security/keys/trusted-keys/Kconfig
> > > +++ b/security/keys/trusted-keys/Kconfig
> > > @@ -1,10 +1,25 @@
> > > config HAVE_TRUSTED_KEYS
> > > bool
> > > +config HAVE_TRUSTED_KEYS_DEBUG
> > > + bool
> > > +
> > > +config TRUSTED_KEYS_DEBUG
> > > + bool "Debug trusted keys"
> > > + depends on HAVE_TRUSTED_KEYS_DEBUG
> > > + default n
> > > + help
> > > + Trusted keys backends and core code that support debug dumps
> > > + can opt-in that feature here. Dumps must only use DEBUG
> > > + level output, as sensitive data may pass by. In the
> > > + kernel-command line traces can be enabled via
> > > + trusted.dyndbg='+p'.
> >
> > Would it be good idea to add an explicit note/warning:
> >
> >
> > NOTE: This option is intended for debugging purposes only. Do not enable on
> > production systems as debug output may expose sensitive cryptographic
> > material.
> > If you are unsure, say N.
> >
> > Apart from this, looks good to me.
> >
> > Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
>
> Thanks, I'll add your tag, but would you mind quickly screening v3 again,
> where I add "trusted.debug=0|1"? And yes, your suggestion about the extra
> warning makes sense.
>
> Let's make this as safe as possible. Mistakes do happen... and then those
> measures pay off :-)
E.g., in the world of 2026 a perfectly realistic scenario is an "agentic
devops team" (unfortunately), which might debug a trusted keys issue and
leave the debug flag on. Thus, both the warning you suggested and the boot
option for good measure do actually mitigate the risks involved.
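As a rough userspace sketch of that layering (a model only, not the v3
code): the dump helper below stays silent unless a compile-time gate and a
runtime switch are both enabled. All model_* names are made up for
illustration; MODEL_TRUSTED_KEYS_DEBUG stands in for
CONFIG_TRUSTED_KEYS_DEBUG and model_trusted_debug for a boot option like
trusted.debug=0|1, whose final semantics are an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Compile-time gate; stands in for CONFIG_TRUSTED_KEYS_DEBUG. */
#ifndef MODEL_TRUSTED_KEYS_DEBUG
#define MODEL_TRUSTED_KEYS_DEBUG 1
#endif

/* Runtime gate; stands in for a boot option. Off by default. */
static bool model_trusted_debug;

/* Hypothetical setter, modelling what parsing the boot option would do. */
void model_set_debug(bool on)
{
	model_trusted_debug = on;
}

/*
 * Hex-dump a key to @out only when both gates are open.
 * Returns the number of key bytes dumped, 0 when the dump is suppressed.
 */
size_t model_dump_key(FILE *out, const unsigned char *key, size_t len)
{
	assert(out != NULL && key != NULL);

	if (!MODEL_TRUSTED_KEYS_DEBUG || !model_trusted_debug)
		return 0;

	for (size_t i = 0; i < len; i++)
		fprintf(out, "%02x", key[i]);
	fputc('\n', out);
	return len;
}
```

With the runtime switch left at its default, a kernel built with the debug
option would still not print key material, which is the "leave the flag on
by accident" case this is meant to cover.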
BR, Jarkko
* Re: [RFC PATCH v4 00/19] Support socket access-control
From: Mickaël Salaün @ 2026-04-08 10:26 UTC (permalink / raw)
To: Mikhail Ivanov
Cc: gnoack, willemdebruijn.kernel, matthieu, linux-security-module,
netdev, netfilter-devel, yusongping, artem.kuzin,
konstantin.meskhidze
In-Reply-To: <20251118134639.3314803-1-ivanov.mikhail1@huawei-partners.com>
Hi Mikhail,
On Tue, Nov 18, 2025 at 09:46:20PM +0800, Mikhail Ivanov wrote:
> Hello! This is v4 RFC patch dedicated to socket protocols restriction.
>
> It is based on the landlock's mic-next branch on top of Linux 6.16-rc2
> kernel version.
>
> Objective
> =========
> Extend Landlock with a mechanism to restrict any set of protocols in
> a sandboxed process.
>
> Closes: https://github.com/landlock-lsm/linux/issues/6
>
> Motivation
> ==========
> Landlock implements the `LANDLOCK_RULE_NET_PORT` rule type, which provides
> fine-grained control of actions for a specific protocol. Any action or
> protocol that is not supported by this rule can not be controlled. As a
> result, protocols for which fine-grained control is not supported can be
> used in a sandboxed system and lead to vulnerabilities or unexpected
> behavior.
>
> Controlling the protocols used will allow to use only those that are
> necessary for the system and/or which have fine-grained Landlock control
> through others types of rules (e.g. TCP bind/connect control with
> `LANDLOCK_RULE_NET_PORT`, UNIX bind control with
> `LANDLOCK_RULE_PATH_BENEATH`).
>
> Consider following examples:
> * Server may want to use only TCP sockets for which there is fine-grained
> control of bind(2) and connect(2) actions [1].
> * System that does not need a network or that may want to disable network
> for security reasons (e.g. [2]) can achieve this by restricting the use
> of all possible protocols.
>
> [1] https://lore.kernel.org/all/ZJvy2SViorgc+cZI@google.com/
> [2] https://cr.yp.to/unix/disablenetwork.html
>
> Implementation
> ==============
> This patchset adds control over the protocols used by implementing a
> restriction of socket creation. This is possible thanks to the new type
> of rule - `LANDLOCK_RULE_SOCKET`, that allows to restrict actions on
> sockets, and a new access right - `LANDLOCK_ACCESS_SOCKET_CREATE`, that
> corresponds to user space sockets creation. The key in this rule
> corresponds to communication protocol signature from socket(2) syscall.
FYI, I sent a new patch series that adds a handled_perm field to
rulesets:
https://lore.kernel.org/all/20260312100444.2609563-6-mic@digikod.net/
See also the rationale:
https://lore.kernel.org/all/20260312100444.2609563-12-mic@digikod.net/
I think that would work well with the socket creation permission. WDYT?
Do you think you'll be able to continue this work or would you like me
or Günther to complete the remaining last bits (while of course keeping
you as the main author)?
>
> The right to create a socket is checked in the LSM hook which is called
> in the __sock_create method. The following user space operations are
> subject to this check: socket(2), socketpair(2), io_uring(7).
>
> `LANDLOCK_ACCESS_SOCKET_CREATE` does not restrict socket creation
> performed by accept(2), because created socket is used for messaging
> between already existing endpoints.
>
> Design discussion
> ===================
> 1. Should `SCTP_SOCKOPT_PEELOFF` and socketpair(2) be restricted?
>
> An SCTP socket can be connected to multiple endpoints (one-to-many
> relation). Calling setsockopt(2) on such socket with option
> `SCTP_SOCKOPT_PEELOFF` detaches one of existing connections to a separate
> UDP socket. This detach is currently restrictable.
>
> The same applies to the socketpair(2) syscall. It was noted that denying
> usage of socketpair(2) in a sandboxed environment may not be meaningful [1].
>
> Currently both operations use general socket interface to create sockets.
> Therefore it's not possible to distinguish between socket(2) and those
> operations inside security_socket_create LSM hook which is currently
> used for protocols restriction. Providing such separation may require
> changes in socket layer (eg. in __sock_create) interface which may not be
> acceptable.
>
> [1] https://lore.kernel.org/all/ZurZ7nuRRl0Zf2iM@google.com/
>
> Code coverage
> =============
> Code coverage(gcov) report with the launch of all the landlock selftests:
> * security/landlock:
> lines......: 94.0% (1200 of 1276 lines)
> functions..: 95.0% (134 of 141 functions)
>
> * security/landlock/socket.c:
> lines......: 100.0% (56 of 56 lines)
> functions..: 100.0% (5 of 5 functions)
>
> Currently landlock-test-tools fails on mini.kernel_socket test due to lack
> of SMC protocol support.
>
> General changes v3->v4
> ======================
> * Implementation
> * Adds protocol field to landlock_socket_attr.
> * Adds protocol masks support via wildcards values in
> landlock_socket_attr.
> * Changes LSM hook used from socket_post_create to socket_create.
> * Changes protocol ranges acceptable by socket rules.
> * Adds audit support.
> * Changes ABI version to 8.
> * Tests
> * Adds 5 new tests:
> * mini.rule_with_wildcard, protocol_wildcard.access,
> mini.ruleset_with_wildcards_overlap:
> verify rulesets containing rules with wildcard values.
> * tcp_protocol.alias_restriction: verify that Landlock doesn't
> perform protocol mappings.
> * audit.socket_create: tests audit denial logging.
> * Squashes tests corresponding to Landlock rule adding to a single commit.
> * Documentation
> * Refactors Documentation/userspace-api/landlock.rst.
> * Commits
> * Rebases on mic-next.
> * Refactors commits.
>
> Previous versions
> =================
> v3: https://lore.kernel.org/all/20240904104824.1844082-1-ivanov.mikhail1@huawei-partners.com/
> v2: https://lore.kernel.org/all/20240524093015.2402952-1-ivanov.mikhail1@huawei-partners.com/
> v1: https://lore.kernel.org/all/20240408093927.1759381-1-ivanov.mikhail1@huawei-partners.com/
>
> Mikhail Ivanov (19):
> landlock: Support socket access-control
> selftests/landlock: Test creating a ruleset with unknown access
> selftests/landlock: Test adding a socket rule
> selftests/landlock: Testing adding rule with wildcard value
> selftests/landlock: Test acceptable ranges of socket rule key
> landlock: Add hook on socket creation
> selftests/landlock: Test basic socket restriction
> selftests/landlock: Test network stack error code consistency
> selftests/landlock: Test overlapped rulesets with rules of protocol
> ranges
> selftests/landlock: Test that kernel space sockets are not restricted
> selftests/landlock: Test protocol mappings
> selftests/landlock: Test socketpair(2) restriction
> selftests/landlock: Test SCTP peeloff restriction
> selftests/landlock: Test that accept(2) is not restricted
> lsm: Support logging socket common data
> landlock: Log socket creation denials
> selftests/landlock: Test socket creation denial log for audit
> samples/landlock: Support socket protocol restrictions
> landlock: Document socket rule type support
>
> Documentation/userspace-api/landlock.rst | 48 +-
> include/linux/lsm_audit.h | 8 +
> include/uapi/linux/landlock.h | 60 +-
> samples/landlock/sandboxer.c | 118 +-
> security/landlock/Makefile | 2 +-
> security/landlock/access.h | 3 +
> security/landlock/audit.c | 12 +
> security/landlock/audit.h | 1 +
> security/landlock/limits.h | 4 +
> security/landlock/ruleset.c | 37 +-
> security/landlock/ruleset.h | 46 +-
> security/landlock/setup.c | 2 +
> security/landlock/socket.c | 198 +++
> security/landlock/socket.h | 20 +
> security/landlock/syscalls.c | 61 +-
> security/lsm_audit.c | 4 +
> tools/testing/selftests/landlock/base_test.c | 2 +-
> tools/testing/selftests/landlock/common.h | 14 +
> tools/testing/selftests/landlock/config | 47 +
> tools/testing/selftests/landlock/net_test.c | 11 -
> .../selftests/landlock/protocols_define.h | 169 +++
> .../testing/selftests/landlock/socket_test.c | 1169 +++++++++++++++++
> 22 files changed, 1990 insertions(+), 46 deletions(-)
> create mode 100644 security/landlock/socket.c
> create mode 100644 security/landlock/socket.h
> create mode 100644 tools/testing/selftests/landlock/protocols_define.h
> create mode 100644 tools/testing/selftests/landlock/socket_test.c
>
>
> base-commit: 6dde339a3df80a57ac3d780d8cfc14d9262e2acd
> --
> 2.34.1
>
>
* Re: LSM: Whiteout chardev creation sidesteps mknod hook
From: Mickaël Salaün @ 2026-04-08 11:01 UTC (permalink / raw)
To: Günther Noack
Cc: Christian Brauner, Paul Moore, linux-security-module,
John Johansen, Georgia Garcia, Kentaro Takeda, Tetsuo Handa
In-Reply-To: <adUBCQXrt7kmgqJT@google.com>
On Tue, Apr 07, 2026 at 03:05:13PM +0200, Günther Noack wrote:
> Hello Christian, Paul, Mickaël and LSM maintainers!
>
> I discovered the following bug in Landlock, which potentially also
> affects other LSMs:
>
> With renameat2(2)'s RENAME_WHITEOUT flag, it is possible to create a
> "whiteout object" at the source of the rename. Whiteout objects are
> character devices with major/minor (0, 0) -- these devices are not
> bound to any driver, so they are harmless, but still, the creation of
> these files can sidestep the LANDLOCK_ACCESS_FS_MAKE_CHAR access right
> in Landlock.
Any way to "write" on the filesystem should properly be controlled. The
man page says that RENAME_WHITEOUT requires CAP_MKNOD, however, looking
at vfs_mknod(), there is an explicit exception to not check CAP_MKNOD
for whiteout devices. See commit a3c751a50fe6 ("vfs: allow unprivileged
whiteout creation").
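To make the gap concrete, here is a minimal userspace sketch (not part of
any patch) that performs such a rename and checks what is left at the
source. It assumes a filesystem that implements RENAME_WHITEOUT, such as
tmpfs; the helper names is_whiteout() and rename_leaves_whiteout() are made
up for illustration, and RENAME_WHITEOUT is defined locally in case the libc
headers do not provide it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/sysmacros.h>
#include <unistd.h>

#ifndef RENAME_WHITEOUT
#define RENAME_WHITEOUT (1 << 2) /* value from the Linux UAPI headers */
#endif

/* A whiteout is a character device with device number (0, 0). */
int is_whiteout(const struct stat *st)
{
	assert(st != NULL);
	return S_ISCHR(st->st_mode) && major(st->st_rdev) == 0 &&
	       minor(st->st_rdev) == 0;
}

/*
 * Rename @src to @dst with RENAME_WHITEOUT and inspect what is left at @src.
 * Returns 1 if a whiteout now exists at @src, 0 if something else does, and
 * -1 on error (for example when the filesystem lacks RENAME_WHITEOUT).
 */
int rename_leaves_whiteout(const char *src, const char *dst)
{
	struct stat st;

	if (syscall(SYS_renameat2, AT_FDCWD, src, AT_FDCWD, dst,
		    RENAME_WHITEOUT) != 0)
		return -1;
	if (lstat(src, &st) != 0)
		return -1;
	return is_whiteout(&st);
}
```

Run inside a Landlock domain that handles LANDLOCK_ACCESS_FS_MAKE_CHAR
without granting it, a return value of 1 would show the behaviour Günther
describes.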
>
>
> I am unconvinced which is the right fix here -- do you have an opinion
> on this from the VFS/LSM side?
>
>
> Option 1: Make filesystems call security_path_mknod() during RENAME_WHITEOUT?
This is the right semantic.
>
> Do it in the VFS rename hook.
>
> * Pro: Fixes it for all LSMs
> * Con: Call would have to be done in multiple filesystems
That would not work.
>
>
> Option 2: Handle it in security_{path,inode}_rename()
>
> Make Landlock handle it in security_inode_rename() by looking for the
> RENAME_WHITEOUT flag.
>
> * Con: Operation should only be denied if the file system even
> implements RENAME_WHITEOUT, and we would have to maintain a list of
> affected filesystems for that. (That feels like solving it at the
> wrong layer of abstraction.)
Why would we need to maintain such a list? If it's only about the errno,
well, that would not be perfect but would be OK with proper documentation.
I'm mostly worried that there might be other (future) call paths to
create whiteout devices.
I think option 2 would be the most practical approach for Landlock, with
a new LANDLOCK_ACCESS_FS_MAKE_WHITEOUT right.
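A userspace model of what that check could look like, as a sketch only: the
function below is not kernel code, the bit value chosen for the new right is
hypothetical (no such right exists in Landlock today), and
model_check_rename() is an illustrative name. It only shows the decision
logic: a plain rename needs nothing extra, a RENAME_WHITEOUT rename needs
the new right on the source's parent directory.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#ifndef RENAME_WHITEOUT
#define RENAME_WHITEOUT (1 << 2) /* value from the Linux UAPI headers */
#endif

/* Hypothetical bit standing in for LANDLOCK_ACCESS_FS_MAKE_WHITEOUT. */
#define MODEL_ACCESS_FS_MAKE_WHITEOUT (UINT64_C(1) << 0)

/*
 * Model of the extra check a rename hook could perform.
 * @src_parent_allowed: rights granted on the source's parent directory.
 * @flags: renameat2(2) flags.
 * Returns 0 if allowed, -EACCES if the whiteout creation is denied.
 */
int model_check_rename(uint64_t src_parent_allowed, unsigned int flags)
{
	if (!(flags & RENAME_WHITEOUT))
		return 0; /* plain rename: nothing extra to check */
	if (src_parent_allowed & MODEL_ACCESS_FS_MAKE_WHITEOUT)
		return 0;
	return -EACCES;
}

/* Small self-check covering the three cases. */
void model_demo(void)
{
	assert(model_check_rename(0, 0) == 0);
	assert(model_check_rename(0, RENAME_WHITEOUT) == -EACCES);
	assert(model_check_rename(MODEL_ACCESS_FS_MAKE_WHITEOUT,
				  RENAME_WHITEOUT) == 0);
}
```

Note that this model ignores whether the filesystem supports
RENAME_WHITEOUT at all, which is the errno question raised above.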
I'm also wondering what the chances are that another kind of special file
type, like a whiteout device, could come up in the future. Any guess,
Christian?
> * Con: Unclear whether other LSMs need a similar fix
I guess at least AppArmor and Tomoyo would consider that an issue.
>
>
> Option 3: Declare that this is working as intended?
We need to be able to control any file creation, which is not currently
the case because of this whiteout exception.
>
> * Pro: (0, 0) is not a "real" character device
>
>
> In cases 1 and 2, we'd likely need to double check that we are not
> breaking existing scenarios involving OverlayFS, by suddenly requiring
> a more lax policy for creating character devices on these directories.
>
> Please let me know what you think. I'm specifically interested in:
>
> 1. Christian: What is the appropriate way to do this VFS wise?
> 2. LSM maintainers: Is this a bug that affects other LSMs as well?
>
> Thanks,
> —Günther
>
> P.S.: For full transparency, I found this bug by pointing Google
> Gemini at the Landlock codebase.
>
* Re: [RFC PATCH 00/20] BPF interface for applying Landlock rulesets
From: Justin Suess @ 2026-04-08 11:41 UTC (permalink / raw)
To: Ihor Solodrai
Cc: ast, daniel, andrii, kpsingh, paul, mic, viro, brauner, kees,
gnoack, jack, jmorris, serge, song, yonghong.song, martin.lau, m,
eddyz87, john.fastabend, sdf, skhan, bpf, linux-security-module,
linux-kernel, linux-fsdevel
In-Reply-To: <5dfadfd4-ea02-4c3f-8d01-5d979ea06747@linux.dev>
On Tue, Apr 07, 2026 at 09:40:07PM -0700, Ihor Solodrai wrote:
>
>
> On 4/7/26 1:01 PM, Justin Suess wrote:
> > Hello,
> >
> > This series lets sleepable BPF LSM programs apply an existing,
> > userspace-created Landlock ruleset to a program during exec.
> >
> > The goal is not to move Landlock policy definition into BPF, nor to create a
> > second policy engine. Instead, BPF is used only to select when an already
> > valid Landlock ruleset should be applied, based on runtime exec context.
> >
> > Background
> > ===
> >
> > Landlock is primarily a syscall-driven, unprivileged-first LSM. That model
> > works well when the application being sandboxed can create and enforce its own
> > rulesets, or when a trusted launcher can impose restrictions directly before
> > running a trusted target.
> >
> > That becomes harder when the target program is not under first-party control,
> > for example:
> >
> > 1. third-party binaries,
> > 2. unmodified container images,
> > 3. programs reached through shells, wrappers, or service managers, and
> > 4. user-supplied or otherwise untrusted code.
> >
> > In these cases, an external supervisor may want to apply a Landlock ruleset to
> > the final executed program, while leaving unrelated parents or helper
> > processes alone.
> >
> > Why external sandboxing is awkward today
> > ===
> >
> > There are two recurring problems.
> >
> > First, userspace cannot reliably predict every file a target may need across
> > different systems, packaging layouts, and runtime conditions. Shared
> > libraries, configuration files, interpreters, and helper binaries often depend
> > on details that are only known at runtime.
> >
> > Second, Landlock inheritance is intentionally one-way. Once a task is
> > restricted, descendants inherit that domain and may only become more
> > restricted. This is exactly what Landlock should do, but it makes external
> > sandboxing awkward when the program of interest is buried inside a larger exec
> > chain. Applying restrictions too early can affect unrelated intermediates;
> > applying them too late misses the target entirely.
> >
> > This series addresses that target-selection problem.
> >
> > Overview
> > ===
> >
> > This series adds a small BPF-to-Landlock bridge:
> >
> > 1. userspace creates a normal Landlock ruleset through the existing ABI;
> > 2. userspace inserts that ruleset FD into a new
> > BPF_MAP_TYPE_LANDLOCK_RULESET map;
> > 3. a sleepable BPF LSM program attached to an exec-time hook looks up the
> > ruleset; and
> > 4. the program calls a kfunc to apply that ruleset to the new program's
> > credentials before exec completes.
> >
> > The important point is that BPF does not create, inspect, or mutate Landlock
> > policy here. It only decides whether to apply a ruleset that was already
> > created and validated through Landlock's existing userspace API.
> >
> > Interface
> > ===
> >
> > The series adds:
> >
> > 1. bpf_landlock_restrict_binprm(), which applies a referenced ruleset to
> > struct linux_binprm credentials;
> > 2. bpf_landlock_put_ruleset(), which releases a referenced ruleset; and
> > 3. BPF_MAP_TYPE_LANDLOCK_RULESET, a specialized map type for holding
> > references to Landlock rulesets originating from userspace file
> > descriptors.
> > 4. A new field in the linux_binprm struct to enable application of
> > task_set_no_new_privs once execution is beyond the point of no return.
> >
> > The kfuncs are restricted to sleepable BPF LSM programs attached to
> > bprm_creds_for_exec and bprm_creds_from_file, which are the points where the
> > new program's credentials may still be updated safely.
> >
> > This series also adds LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS. On the BPF path,
> > this is staged through the exec context and committed only after exec reaches
> > point-of-no-return. This avoids side effects on failed executions while
> > ensuring that the resulting task cannot gain more privileges through later exec
> > transitions. This is done through the set_nnp_on_point_of_no_return field.
> >
> > This has a little subtlety: LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS in the BPF
> > path will not stop the current execution from escalating at all; only subsequent
> > ones. This is intentional to allow landlock policies to be applied through a
> > setuid transition for instance, without affecting the current escalation.
> >
> > Semantics
> > ===
> >
> > This proposal is intended to preserve Landlock semantics as much as practical
> > for an exec-time BPF attachment model:
> >
> > 1. only pre-existing Landlock rulesets may be applied;
> > 2. BPF cannot construct, inspect, or modify rulesets;
> > 3. enforcement still happens before the new program begins execution;
> > 4. normal Landlock inheritance, layering, and future composition remain
> > unchanged; and
> > 5. this does not bypass Landlock's privilege checks for applying Landlock
> > rulesets.
> >
> > In other words, BPF acts as an external selector for when to apply Landlock,
> > not as a replacement for Landlock's enforcement engine.
> >
> > All behavior, future access rights, and previous access rights are designed
> > to automatically be supported from either BPF or existing syscall contexts.
> >
> > The main semantic difference is LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS on the BPF
> > path: it guarantees that the resulting task is pinned with no_new_privs before
> > it can perform later exec transitions, but it does not retroactively suppress
> > privilege gain for the current exec transition itself.
> >
> > The other exception to semantics is the LANDLOCK_RESTRICT_SELF_TSYNC flag.
> > (see Points of Feedback section)
> >
> > Patch layout
> > ===
> >
> > Patches 1-5 prepare the Landlock side by moving shared ruleset logic out of
> > syscalls.c, adding a no_new_privs flag for non-syscall callers, exposing
> > linux_binprm->set_nnp_on_point_of_no_return as an interface to set no_new_privs
> > on the point of no return, and making deferred ruleset destruction RCU-safe.
> >
> > Patches 6-10 add the BPF-facing pieces: the Landlock kfuncs, the new map type,
> > syscall handling for that map, and verifier support.
> >
> > Patches 11-15 add selftests and the small bpftool update needed for the new
> > map type.
> >
> > Patches 16-20 add docs and bump the ABI version and update MAINTAINERS.
> >
> > Feedback is especially welcome on the overall interface shape, the choice of
> > hooks, and the map semantics.
> >
> > Testing
> > ===
> >
> > This patch series has two portions of tests.
> >
> > One lives in the traditional Landlock selftests, for the new
> > LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS flag.
> >
> > The other suite lives under the BPF selftests, and this tests the Landlock
> > kfuncs and the new BPF_MAP_TYPE_LANDLOCK_RULESET.
> >
> > This patch series was run through BPF CI, the results of which are here. [1]
> >
> > All mentioned tests are passing, as well as the BPF CI.
> >
> > [1] : https://github.com/kernel-patches/bpf/pull/11562
>
> Hello Justin.
>
> I regret to disappoint you with a lame piece of feedback, but the
> series hasn't been picked up by automated BPF CI pipeline properly:
> https://github.com/kernel-patches/bpf/pull/11709
>
Apologies.
> I suggest you rebase on top of bpf-next/master [1], and re-submit to
> the mailing list with a bpf-next tag in subject:
> "[RFC PATCH bpf-next ...] bpf: ..."
>
No problem. Sorry about that; I based it off the Landlock-next branch.
My fault, I thought the CI had to be initiated manually... oh well.
I'll resubmit soon. Luckily, it looks like a perfectly clean rebase.
> I'm pretty sure AI bot will find something annoying to address.
>
> Other than that, please be patient. It'll probably take a while for
> maintainers and reviewers to digest this work before anyone can
> meaningfully comment. Thanks!
>
Thank you for your time and help!
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/
>
> >
> > Points of Feedback
> > ===
> >
> > First, the new set_nnp_on_point_of_no_return field in struct linux_binprm.
> > This field was needed to request that task_set_no_new_privs be set during an
> > execution, but only after the execution has proceeded beyond the point of no
> > return. I couldn't find a way to express this semantic without adding a new
> > bitfield to struct linux_binprm and a conditional in fs/exec.c. Please see
> > patch 2.
> >
> > Feedback on the BPF testing harness, which was generated with AI assistance as
> > disclosed in the commit footer, is welcomed. I have only limited familiarity
> > with BPF testing practices. These tests were made with strong human supervision.
> > See patches 14 and 15.
> >
> > Feedback on the NO_NEW_PRIVS situation is also welcomed. Because task_set_no_new_privs()
> > would otherwise leak state on failed executions or AT_EXECVE_CHECK, this series
> > stages no_new_privs through the exec context and only commits it after
> > point-of-no-return. This preserves failure behavior while still ensuring that
> > the resulting task cannot elevate further through later exec transitions.
> > When called from bprm_creds_from_file, this does not retroactively change the
> > privilege outcome of the current exec transition itself.
> >
> > See patch 2 and 3.
> >
> > Next, the RCU in the landlock_ruleset. Existing BPF maps use RCU to make sure maps
> > holding references stay valid. I altered the landlock ruleset to use rcu_work
> > to make sure that the rcu is synchronized before putting on a ruleset, and
> > acquire the rcu in the arraymap implementation. See patches 5-10.
> >
> > Next, the semantics of the map. What operations should be supported from BPF
> > and userspace and what data types should they return? I consider the struct
> > bpf_landlock_ruleset to be opaque. Userspace can add items to the map via the
> > fd, delete items by their index, and BPF can delete and lookup items by their
> > index. Items cannot be updated, only swapped.
> >
> > Finally, the handling of the LANDLOCK_RESTRICT_SELF_TSYNC flag. This flag has
> > no meaning in a pre-execution context, as the credentials during the designated
> > LSM hooks (bprm_creds_for_exec/creds_from_file) still represent the pre-execution
> > task. Therefore, this flag is invalidated and attempting to use it with
> > bpf_landlock_restrict_binprm will return -EINVAL. Otherwise, the flag would
> > result in applying the landlock ruleset to the wrong target in addition to the
> > intended one. (see patch 2). This behavior is validated with selftests.
> >
> > Existing works / Credits
> > ===
> >
> > Mickaël Salaün created patchsets adding BPF tracepoints for landlock in [2] [3].
> >
> > Mickaël also gave feedback on this feature and the idea in this GitHub thread. [4]
> >
> > Günther Noack initially received and provided initial feedback on this idea as
> > an early prototype.
> >
> > Liz Rice, author of "Learning eBPF: Programming the Linux Kernel for Enhanced
> > Observability, Networking, and Security" provided background and inspired me to
> > experiment with BPF and the BPF LSM. [5]
> >
> > [2] : https://lore.kernel.org/all/20250523165741.693976-1-mic@digikod.net/
> > [3] : https://lore.kernel.org/linux-security-module/20260406143717.1815792-1-mic@digikod.net/
> > [4] : https://github.com/landlock-lsm/linux/issues/56
> > [5] : https://wellesleybooks.com/book/9781098135126
> >
> > Kind Regards,
> > Justin Suess
> >
> > Justin Suess (20):
> > landlock: Move operations from syscall into ruleset code
> > execve: Add set_nnp_on_point_of_no_return
> > landlock: Implement LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
> > selftests/landlock: Cover LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
> > landlock: Make ruleset deferred free RCU safe
> > bpf: lsm: Add Landlock kfuncs
> > bpf: arraymap: Implement Landlock ruleset map
> > bpf: Add Landlock ruleset map type
> > bpf: syscall: Handle Landlock ruleset maps
> > bpf: verifier: Add Landlock ruleset map support
> > selftests/bpf: Add Landlock kfunc declarations
> > selftests/landlock: Rename gettid wrapper for BPF reuse
> > selftests/bpf: Enable Landlock in selftests kernel.
> > selftests/bpf: Add Landlock kfunc test program
> > selftests/bpf: Add Landlock kfunc test runner
> > landlock: Bump ABI version
> > tools: bpftool: Add documentation for landlock_ruleset
> > landlock: Document LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
> > bpf: Document BPF_MAP_TYPE_LANDLOCK_RULESET
> > MAINTAINERS: update entry for the Landlock subsystem
> >
> > Documentation/bpf/map_landlock_ruleset.rst | 181 +++++
> > Documentation/userspace-api/landlock.rst | 22 +-
> > MAINTAINERS | 4 +
> > fs/exec.c | 8 +
> > include/linux/binfmts.h | 7 +-
> > include/linux/bpf_lsm.h | 15 +
> > include/linux/bpf_types.h | 1 +
> > include/linux/landlock.h | 92 +++
> > include/uapi/linux/bpf.h | 1 +
> > include/uapi/linux/landlock.h | 14 +
> > kernel/bpf/arraymap.c | 67 ++
> > kernel/bpf/bpf_lsm.c | 145 ++++
> > kernel/bpf/syscall.c | 4 +-
> > kernel/bpf/verifier.c | 15 +-
> > samples/landlock/sandboxer.c | 7 +-
> > security/landlock/limits.h | 2 +-
> > security/landlock/ruleset.c | 198 ++++-
> > security/landlock/ruleset.h | 25 +-
> > security/landlock/syscalls.c | 158 +---
> > .../bpf/bpftool/Documentation/bpftool-map.rst | 2 +-
> > tools/bpf/bpftool/map.c | 2 +-
> > tools/include/uapi/linux/bpf.h | 1 +
> > tools/lib/bpf/libbpf.c | 1 +
> > tools/lib/bpf/libbpf_probes.c | 6 +
> > tools/testing/selftests/bpf/bpf_kfuncs.h | 20 +
> > tools/testing/selftests/bpf/config | 5 +
> > tools/testing/selftests/bpf/config.x86_64 | 1 -
> > .../bpf/prog_tests/landlock_kfuncs.c | 733 ++++++++++++++++++
> > .../selftests/bpf/progs/landlock_kfuncs.c | 92 +++
> > tools/testing/selftests/landlock/base_test.c | 10 +-
> > tools/testing/selftests/landlock/common.h | 28 +-
> > tools/testing/selftests/landlock/fs_test.c | 103 +--
> > tools/testing/selftests/landlock/net_test.c | 55 +-
> > .../testing/selftests/landlock/ptrace_test.c | 14 +-
> > .../landlock/scoped_abstract_unix_test.c | 51 +-
> > .../selftests/landlock/scoped_base_variants.h | 23 +
> > .../selftests/landlock/scoped_common.h | 5 +-
> > .../selftests/landlock/scoped_signal_test.c | 30 +-
> > tools/testing/selftests/landlock/wrappers.h | 2 +-
> > 39 files changed, 1877 insertions(+), 273 deletions(-)
> > create mode 100644 Documentation/bpf/map_landlock_ruleset.rst
> > create mode 100644 include/linux/landlock.h
> > create mode 100644 tools/testing/selftests/bpf/prog_tests/landlock_kfuncs.c
> > create mode 100644 tools/testing/selftests/bpf/progs/landlock_kfuncs.c
> >
> >
> > base-commit: 8c6a27e02bc55ab110d1828610048b19f903aaec
>
* [PATCH] security: remove BUG_ON in security_skb_classify_flow
From: Jiayuan Chen @ 2026-04-08 11:42 UTC (permalink / raw)
To: linux-security-module, paul
Cc: jmorris, serge, linux-kernel, Jiayuan Chen, Kaiyan Mei, Yinhao Hu,
Dongliang Mu
A BPF program attached to the xfrm_decode_session hook can return a
non-zero value, which causes BUG_ON(rc) in security_skb_classify_flow()
to trigger a kernel panic.
Remove the BUG_ON and change the return type from void to int, so that
callers can optionally handle the error.
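The change in behaviour can be modelled in userspace as follows. This is a
sketch under stated assumptions, not kernel code: decode_hook_t stands in
for the xfrm_decode_session hook, assert() stands in for BUG_ON(), and all
model_*/ok_hook/failing_hook names are hypothetical. The caller that drops
the packet on error is only one possible way a caller might react; this
patch itself does not change any caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for a hook that decodes a packet into a secid. */
typedef int (*decode_hook_t)(const void *skb, unsigned int *secid);

/* Well-behaved hook: sets a secid and returns 0. */
int ok_hook(const void *skb, unsigned int *secid)
{
	(void)skb;
	*secid = 42;
	return 0;
}

/* Hook returning an error, as an attached BPF program might. */
int failing_hook(const void *skb, unsigned int *secid)
{
	(void)skb;
	(void)secid;
	return -EPERM;
}

/* Before: any non-zero hook result aborts. */
void model_classify_flow_old(decode_hook_t hook, const void *skb,
			     unsigned int *secid)
{
	int rc = hook(skb, secid);

	(void)rc;
	assert(rc == 0); /* stands in for BUG_ON(rc) */
}

/* After: the hook result is handed back to the caller. */
int model_classify_flow(decode_hook_t hook, const void *skb,
			unsigned int *secid)
{
	return hook(skb, secid);
}

/* A caller that chooses to handle the error by dropping the packet. */
int model_receive(decode_hook_t hook, const void *skb)
{
	unsigned int secid = 0;
	int rc = model_classify_flow(hook, skb, &secid);

	if (rc)
		return rc; /* drop instead of crashing */
	return 0;
}
```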
Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
Reported-by: Dongliang Mu <dzm91@hust.edu.cn>
Closes: https://lore.kernel.org/bpf/4c4d04ba.6c12b.19c039b69e6.Coremail.kaiyanm@hust.edu.cn/
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
include/linux/security.h | 7 ++++---
security/security.c | 16 +++++++++++-----
2 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/include/linux/security.h b/include/linux/security.h
index ee88dd2d2d1f..6d210dc4c649 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1975,7 +1975,7 @@ int security_xfrm_state_pol_flow_match(struct xfrm_state *x,
struct xfrm_policy *xp,
const struct flowi_common *flic);
int security_xfrm_decode_session(struct sk_buff *skb, u32 *secid);
-void security_skb_classify_flow(struct sk_buff *skb, struct flowi_common *flic);
+int security_skb_classify_flow(struct sk_buff *skb, struct flowi_common *flic);
#else /* CONFIG_SECURITY_NETWORK_XFRM */
@@ -2038,9 +2038,10 @@ static inline int security_xfrm_decode_session(struct sk_buff *skb, u32 *secid)
return 0;
}
-static inline void security_skb_classify_flow(struct sk_buff *skb,
- struct flowi_common *flic)
+static inline int security_skb_classify_flow(struct sk_buff *skb,
+ struct flowi_common *flic)
{
+ return 0;
}
#endif /* CONFIG_SECURITY_NETWORK_XFRM */
diff --git a/security/security.c b/security/security.c
index a26c1474e2e4..26a34eb363c2 100644
--- a/security/security.c
+++ b/security/security.c
@@ -4990,12 +4990,18 @@ int security_xfrm_decode_session(struct sk_buff *skb, u32 *secid)
return call_int_hook(xfrm_decode_session, skb, secid, 1);
}
-void security_skb_classify_flow(struct sk_buff *skb, struct flowi_common *flic)
+/**
+ * security_skb_classify_flow() - Set the flow's secid from the security label
+ * @skb: packet
+ * @flic: flow common structure to set
+ *
+ * Decode the packet in @skb and set the flow's secid in @flic.
+ *
+ * Return: 0 on success, or the hook's non-zero return value otherwise.
+ */
+int security_skb_classify_flow(struct sk_buff *skb, struct flowi_common *flic)
{
- int rc = call_int_hook(xfrm_decode_session, skb, &flic->flowic_secid,
- 0);
-
- BUG_ON(rc);
+ return call_int_hook(xfrm_decode_session, skb, &flic->flowic_secid, 0);
}
EXPORT_SYMBOL(security_skb_classify_flow);
#endif /* CONFIG_SECURITY_NETWORK_XFRM */
--
2.43.0
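For illustration only, the effect of this change can be modelled in userspace. The sketch below is not kernel code: mock_decode_session(), classify_flow(), handle_packet(), mock_hook_rc and the simplified structs are hypothetical stand-ins, and the caller-side handling is only an assumption about how a caller might use the new return value.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sk_buff { int unused; };
struct flowi_common { unsigned int flowic_secid; };

/* Value the mocked hook returns; non-zero models a misbehaving BPF program. */
int mock_hook_rc;

/* Stand-in for call_int_hook(xfrm_decode_session, ...). */
int mock_decode_session(struct sk_buff *skb, unsigned int *secid)
{
	(void)skb;
	if (mock_hook_rc)
		return mock_hook_rc;
	*secid = 42; /* arbitrary label for the demo */
	return 0;
}

/* Patched shape: propagate the hook's result instead of BUG_ON(rc). */
int classify_flow(struct sk_buff *skb, struct flowi_common *flic)
{
	return mock_decode_session(skb, &flic->flowic_secid);
}

/* A caller can now drop the packet on error rather than panic. */
int handle_packet(struct sk_buff *skb)
{
	struct flowi_common flic = { 0 };
	int rc = classify_flow(skb, &flic);

	if (rc) {
		fprintf(stderr, "classify failed: %d, dropping\n", rc);
		return rc;
	}
	return 0;
}
```

With mock_hook_rc left at 0, handle_packet() succeeds; setting it to a non-zero value makes the error reach the caller instead of stopping the program.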
* Re: LSM: Whiteout chardev creation sidesteps mknod hook
From: Mickaël Salaün @ 2026-04-08 12:24 UTC (permalink / raw)
To: Christian Brauner
Cc: Günther Noack, Paul Moore, linux-security-module,
John Johansen, Georgia Garcia, Kentaro Takeda, Tetsuo Handa,
linux-fsdevel, Alejandro Colomar
In-Reply-To: <20260408.beu1Eing5aFo@digikod.net>
CCing fsdevel and Alejandro.
On Wed, Apr 08, 2026 at 01:01:31PM +0200, Mickaël Salaün wrote:
> On Tue, Apr 07, 2026 at 03:05:13PM +0200, Günther Noack wrote:
> > Hello Christian, Paul, Mickaël and LSM maintainers!
> >
> > I discovered the following bug in Landlock, which potentially also
> > affects other LSMs:
> >
> > With renameat2(2)'s RENAME_WHITEOUT flag, it is possible to create a
> > "whiteout object" at the source of the rename. Whiteout objects are
> > character devices with major/minor (0, 0) -- these devices are not
> > bound to any driver, so they are harmless, but still, the creation of
> > these files can sidestep the LANDLOCK_ACCESS_FS_MAKE_CHAR access right
> > in Landlock.
>
> Any way to "write" on the filesystem should properly be controlled. The
> man page says that RENAME_WHITEOUT requires CAP_MKNOD, however, looking
> at vfs_mknod(), there is an explicit exception to not check CAP_MKNOD
> for whiteout devices. See commit a3c751a50fe6 ("vfs: allow unprivileged
> whiteout creation").
>
> >
> >
> > I am unsure which is the right fix here -- do you have an opinion
> > on this from the VFS/LSM side?
> >
> >
> > Option 1: Make filesystems call security_path_mknod() during RENAME_WHITEOUT?
>
> This is the right semantic.
>
> >
> > Do it in the VFS rename hook.
> >
> > * Pro: Fixes it for all LSMs
> > * Con: Call would have to be done in multiple filesystems
>
> That would not work.
>
> >
> >
> > Option 2: Handle it in security_{path,inode}_rename()
> >
> > Make Landlock handle it in security_inode_rename() by looking for the
> > RENAME_WHITEOUT flag.
> >
> > * Con: Operation should only be denied if the file system even
> > implements RENAME_WHITEOUT, and we would have to maintain a list of
> > affected filesystems for that. (That feels like solving it at the
> > wrong layer of abstraction.)
>
> Why would we need to maintain such a list? If it's only about the errno,
> well, that would not be perfect but would be OK with proper documentation.
>
> I'm mostly worried that there might be other (future) call paths to
> create whiteout devices.
>
> I think option 2 would be the most practical approach for Landlock, with
> a new LANDLOCK_ACCESS_FS_MAKE_WHITEOUT right.
>
> I'm also wondering what the chances are that another kind of special file
> type, like a whiteout device, could come up in the future. Any guess,
> Christian?
>
> > * Con: Unclear whether other LSMs need a similar fix
>
> I guess at least AppArmor and Tomoyo would consider that an issue.
>
> >
> >
> > Option 3: Declare that this is working as intended?
>
> We need to be able to control any file creation, which is not currently
> the case because of this whiteout exception.
>
> >
> > * Pro: (0, 0) is not a "real" character device
> >
> >
> > In cases 1 and 2, we'd likely need to double check that we are not
> > breaking existing scenarios involving OverlayFS, by suddenly requiring
> > a more lax policy for creating character devices on these directories.
> >
> > Please let me know what you think. I'm specifically interested in:
> >
> > 1. Christian: What is the appropriate way to do this VFS wise?
> > 2. LSM maintainers: Is this a bug that affects other LSMs as well?
> >
> > Thanks,
> > —Günther
> >
> > P.S.: For full transparency, I found this bug by pointing Google
> > Gemini at the Landlock codebase.
> >
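The behaviour described in the quoted report can be checked from userspace with a small helper. This is a sketch, not part of any proposed fix: whiteout_left_behind() is a hypothetical name, and the outcome depends on the filesystem supporting RENAME_WHITEOUT and on the kernel allowing unprivileged whiteout creation.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h> /* glibc defines RENAME_WHITEOUT here under _GNU_SOURCE */
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/sysmacros.h>
#include <unistd.h>

#ifndef RENAME_WHITEOUT
#define RENAME_WHITEOUT (1 << 2) /* fallback; value from the renameat2(2) uapi */
#endif

/*
 * Renames src to dst with RENAME_WHITEOUT and reports what is left at src.
 * Returns 1 if src is now a (0, 0) character device, 0 if it is something
 * else, or a negative errno if the rename or the lstat failed.
 */
int whiteout_left_behind(const char *src, const char *dst)
{
	struct stat st;

	/* Raw syscall, so no libc wrapper for renameat2 is required. */
	if (syscall(SYS_renameat2, AT_FDCWD, src, AT_FDCWD, dst,
		    RENAME_WHITEOUT))
		return -errno;
	if (lstat(src, &st))
		return -errno;
	return S_ISCHR(st.st_mode) && major(st.st_rdev) == 0 &&
	       minor(st.st_rdev) == 0;
}
```

Calling the helper on a regular file from a task sandboxed without LANDLOCK_ACCESS_FS_MAKE_CHAR would be one way to observe whether the whiteout is still created.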
* Re: [RFC PATCH 00/20] BPF interface for applying Landlock rulesets
From: Mickaël Salaün @ 2026-04-08 14:00 UTC (permalink / raw)
To: Justin Suess
Cc: ast, daniel, andrii, kpsingh, paul, viro, brauner, kees, gnoack,
jack, jmorris, serge, song, yonghong.song, martin.lau, m, eddyz87,
john.fastabend, sdf, skhan, bpf, linux-security-module,
linux-kernel, linux-fsdevel, Frederick Lawler
In-Reply-To: <20260407200157.3874806-1-utilityemal77@gmail.com>
Thanks for this RFC.
On Tue, Apr 07, 2026 at 04:01:22PM -0400, Justin Suess wrote:
> Hello,
>
> This series lets sleepable BPF LSM programs apply an existing,
> userspace-created Landlock ruleset to a program during exec.
>
> The goal is not to move Landlock policy definition into BPF, nor to create a
> second policy engine. Instead, BPF is used only to select when an already
> valid Landlock ruleset should be applied, based on runtime exec context.
>
> Background
> ===
>
> Landlock is primarily a syscall-driven, unprivileged-first LSM. That model
> works well when the application being sandboxed can create and enforce its own
> rulesets, or when a trusted launcher can impose restrictions directly before
> running a trusted target.
>
> That becomes harder when the target program is not under first-party control,
> for example:
>
> 1. third-party binaries,
> 2. unmodified container images,
> 3. programs reached through shells, wrappers, or service managers, and
> 4. user-supplied or otherwise untrusted code.
>
> In these cases, an external supervisor may want to apply a Landlock ruleset to
> the final executed program, while leaving unrelated parents or helper
> processes alone.
>
> Why external sandboxing is awkward today
> ===
>
> There are two recurring problems.
>
> First, userspace cannot reliably predict every file a target may need across
> different systems, packaging layouts, and runtime conditions. Shared
> libraries, configuration files, interpreters, and helper binaries often depend
> on details that are only known at runtime.
Agreed, it would make sense to leverage eBPF for this context
identification rather than implementing a Landlock-specific feature.
>
> Second, Landlock inheritance is intentionally one-way. Once a task is
> restricted, descendants inherit that domain and may only become more
> restricted. This is exactly what Landlock should do, but it makes external
> sandboxing awkward when the program of interest is buried inside a larger exec
> chain. Applying restrictions too early can affect unrelated intermediates;
> applying them too late misses the target entirely.
This makes sense too.
>
> This series addresses that target-selection problem.
>
> Overview
> ===
>
> This series adds a small BPF-to-Landlock bridge:
>
> 1. userspace creates a normal Landlock ruleset through the existing ABI;
> 2. userspace inserts that ruleset FD into a new
> BPF_MAP_TYPE_LANDLOCK_RULESET map;
> 3. a sleepable BPF LSM program attached to an exec-time hook looks up the
> ruleset; and
> 4. the program calls a kfunc to apply that ruleset to the new program's
> credentials before exec completes.
>
> The important point is that BPF does not create, inspect, or mutate Landlock
> policy here. It only decides whether to apply a ruleset that was already
> created and validated through Landlock's existing userspace API.
I like this approach. It makes it possible for users to enforce Landlock
security policies on arbitrary new executions. Sandboxing at this
specific point is the best time because it ensures consistency for the
whole lifetime of the process, whereas applying new restrictions in the
middle of an execution would make the process unstable (if the request
doesn't come from the process itself).
>
> Interface
> ===
>
> The series adds:
>
> 1. bpf_landlock_restrict_binprm(), which applies a referenced ruleset to
> struct linux_binprm credentials;
> 2. bpf_landlock_put_ruleset(), which releases a referenced ruleset; and
> 3. BPF_MAP_TYPE_LANDLOCK_RULESET, a specialized map type for holding
> references to Landlock rulesets originating from userspace file
> descriptors.
> 4. A new field in the linux_binprm struct to enable application of
> task_set_no_new_privs once execution is beyond the point of no return.
This "beyond the point of no return" is indeed important, and it would
be nice to also have this property for Landlock restriction i.e., only
create a Landlock domain if we know that the execution will succeed (or
if the caller will exit). This is especially important for
logging/tracing event consistency.
>
> The kfuncs are restricted to sleepable BPF LSM programs attached to
> bprm_creds_for_exec and bprm_creds_from_file, which are the points where the
> new program's credentials may still be updated safely.
>
> This series also adds LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS. On the BPF path,
> this is staged through the exec context and committed only after exec reaches
> point-of-no-return. This avoids side effects on failed executions while
> ensuring that the resulting task cannot gain more privileges through later exec
> transitions. This is done through the set_nnp_on_point_of_no_return field.
>
> This has a little subtlety: LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS in the BPF
> path will not stop the current execution from escalating at all; only subsequent
> ones.
This makes sense too, but it needs to be documented.
> This is intentional to allow landlock policies to be applied through a
s/landlock/Landlock/g in every text/comment/commit description please.
> setuid transition for instance, without affecting the current escalation.
>
> Semantics
> ===
>
> This proposal is intended to preserve Landlock semantics as much as practical
> for an exec-time BPF attachment model:
>
> 1. only pre-existing Landlock rulesets may be applied;
> 2. BPF cannot construct, inspect, or modify rulesets;
Inspection will be possible with tracepoints, but it is orthogonal to
this series.
> 3. enforcement still happens before the new program begins execution;
> 4. normal Landlock inheritance, layering, and future composition remain
> unchanged; and
> 5. this does not bypass Landlock's privilege checks for applying Landlock
> rulesets.
>
> In other words, BPF acts as an external selector for when to apply Landlock,
> not as a replacement for Landlock's enforcement engine.
>
> All behavior, future access rights, and previous access rights are designed
> to automatically be supported from either BPF or existing syscall contexts.
>
> The main semantic difference is LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS on the BPF
> path: it guarantees that the resulting task is pinned with no_new_privs before
> it can perform later exec transitions, but it does not retroactively suppress
> privilege gain for the current exec transition itself.
>
> The other exception to semantics is the LANDLOCK_RESTRICT_SELF_TSYNC flag.
> (see Points of Feedback section)
>
> Patch layout
> ===
>
> Patches 1-5 prepare the Landlock side by moving shared ruleset logic out of
> syscalls.c, adding a no_new_privs flag for non-syscall callers, exposing
> linux_binprm->set_nnp_on_point_of_no_return as an interface to set no_new_privs
> on the point of no return, and making deferred ruleset destruction RCU-safe.
>
> Patches 6-10 add the BPF-facing pieces: the Landlock kfuncs, the new map type,
> syscall handling for that map, and verifier support.
>
> Patches 11-15 add selftests and the small bpftool update needed for the new
> map type.
>
> Patches 16-20 add docs and bump the ABI version and update MAINTAINERS.
>
> Feedback is especially welcome on the overall interface shape, the choice of
> hooks, and the map semantics.
I'll review each patch separately, but this approach is promising.
I think it would be simpler to have a dedicated patch series for
LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS, and then send another series
specific to the eBPF side (kfunc, tests, doc...). I'm not sure what is
the best way to deal with dependencies across Landlock and BPF though.
What is the policy for BPF next wrt other next branches?
>
> Testing
> ===
>
> This patch series has two portions of tests.
>
> One lives in the traditional Landlock selftests, for the new
> LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS flag.
>
> The other suite lives under the BPF selftests, and this tests the Landlock
> kfuncs and the new BPF_MAP_TYPE_LANDLOCK_RULESET.
>
> This patch series was run through BPF CI, the results of which are here. [1]
>
> All mentioned tests are passing, as well as the BPF CI.
>
> [1] : https://github.com/kernel-patches/bpf/pull/11562
>
> Points of Feedback
> ===
>
> First, the new set_nnp_on_point_of_no_return field in struct linux_binprm.
> This field was needed to request that task_set_no_new_privs be set during an
> execution, but only after the execution has proceeded beyond the point of no
> return. I couldn't find a way to express this semantic without adding a new
> bitfield to struct linux_binprm and a conditional in fs/exec.c. Please see
> patch 2.
What about using security_bprm_committing_creds()?
>
> Feedback on the BPF testing harness, which was generated with AI assistance as
> disclosed in the commit footer, is welcomed. I have only limited familiarity
> with BPF testing practices. These tests were made with strong human supervision.
> See patches 14 and 15.
>
> Feedback on the NO_NEW_PRIVS situation is also welcomed. Because task_set_no_new_privs()
> would otherwise leak state on failed executions or AT_EXECVE_CHECK, this series
> stages no_new_privs through the exec context and only commits it after
> point-of-no-return. This preserves failure behavior while still ensuring that
> the resulting task cannot elevate further through later exec transitions.
> When called from bprm_creds_from_file, this does not retroactively change the
> privilege outcome of the current exec transition itself.
>
> See patch 2 and 3.
>
> Next, the RCU in the landlock_ruleset. Existing BPF maps use RCU to make sure maps
> holding references stay valid. I altered the landlock ruleset to use rcu_work
> to make sure that the rcu is synchronized before putting on a ruleset, and
> acquire the rcu in the arraymap implementation. See patches 5-10.
>
> Next, the semantics of the map. What operations should be supported from BPF
> and userspace and what data types should they return? I consider the struct
> bpf_landlock_ruleset to be opaque. Userspace can add items to the map via the
> fd, delete items by their index, and BPF can delete and lookup items by their
> index. Items cannot be updated, only swapped.
>
> Finally, the handling of the LANDLOCK_RESTRICT_SELF_TSYNC flag. This flag has
> no meaning in a pre-execution context, as the credentials during the designated
> LSM hooks (bprm_creds_for_exec/creds_from_file) still represent the pre-execution
> task. Therefore, this flag is invalidated and attempting to use it with
> bpf_landlock_restrict_binprm will return -EINVAL. Otherwise, the flag would
> result in applying the landlock ruleset to the wrong target in addition to the
> intended one. (see patch 2). This behavior is validated with selftests.
>
> Existing works / Credits
> ===
>
> Mickaël Salaün created patchsets adding BPF tracepoints for landlock in [2] [3].
>
> Mickaël also gave feedback on this feature and the idea in this GitHub thread. [4]
>
> Günther Noack initially received and provided initial feedback on this idea as
> an early prototype.
>
> Liz Rice, author of "Learning eBPF: Programming the Linux Kernel for Enhanced
> Observability, Networking, and Security" provided background and inspired me to
> experiment with BPF and the BPF LSM. [5]
>
> [2] : https://lore.kernel.org/all/20250523165741.693976-1-mic@digikod.net/
> [3] : https://lore.kernel.org/linux-security-module/20260406143717.1815792-1-mic@digikod.net/
> [4] : https://github.com/landlock-lsm/linux/issues/56
> [5] : https://wellesleybooks.com/book/9781098135126
>
> Kind Regards,
> Justin Suess
>
> Justin Suess (20):
> landlock: Move operations from syscall into ruleset code
> execve: Add set_nnp_on_point_of_no_return
> landlock: Implement LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
> selftests/landlock: Cover LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
> landlock: Make ruleset deferred free RCU safe
> bpf: lsm: Add Landlock kfuncs
> bpf: arraymap: Implement Landlock ruleset map
> bpf: Add Landlock ruleset map type
> bpf: syscall: Handle Landlock ruleset maps
> bpf: verifier: Add Landlock ruleset map support
> selftests/bpf: Add Landlock kfunc declarations
> selftests/landlock: Rename gettid wrapper for BPF reuse
> selftests/bpf: Enable Landlock in selftests kernel.
> selftests/bpf: Add Landlock kfunc test program
> selftests/bpf: Add Landlock kfunc test runner
> landlock: Bump ABI version
> tools: bpftool: Add documentation for landlock_ruleset
> landlock: Document LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
> bpf: Document BPF_MAP_TYPE_LANDLOCK_RULESET
> MAINTAINERS: update entry for the Landlock subsystem
>
> Documentation/bpf/map_landlock_ruleset.rst | 181 +++++
> Documentation/userspace-api/landlock.rst | 22 +-
> MAINTAINERS | 4 +
> fs/exec.c | 8 +
> include/linux/binfmts.h | 7 +-
> include/linux/bpf_lsm.h | 15 +
> include/linux/bpf_types.h | 1 +
> include/linux/landlock.h | 92 +++
> include/uapi/linux/bpf.h | 1 +
> include/uapi/linux/landlock.h | 14 +
> kernel/bpf/arraymap.c | 67 ++
> kernel/bpf/bpf_lsm.c | 145 ++++
> kernel/bpf/syscall.c | 4 +-
> kernel/bpf/verifier.c | 15 +-
> samples/landlock/sandboxer.c | 7 +-
> security/landlock/limits.h | 2 +-
> security/landlock/ruleset.c | 198 ++++-
> security/landlock/ruleset.h | 25 +-
> security/landlock/syscalls.c | 158 +---
> .../bpf/bpftool/Documentation/bpftool-map.rst | 2 +-
> tools/bpf/bpftool/map.c | 2 +-
> tools/include/uapi/linux/bpf.h | 1 +
> tools/lib/bpf/libbpf.c | 1 +
> tools/lib/bpf/libbpf_probes.c | 6 +
> tools/testing/selftests/bpf/bpf_kfuncs.h | 20 +
> tools/testing/selftests/bpf/config | 5 +
> tools/testing/selftests/bpf/config.x86_64 | 1 -
> .../bpf/prog_tests/landlock_kfuncs.c | 733 ++++++++++++++++++
> .../selftests/bpf/progs/landlock_kfuncs.c | 92 +++
> tools/testing/selftests/landlock/base_test.c | 10 +-
> tools/testing/selftests/landlock/common.h | 28 +-
> tools/testing/selftests/landlock/fs_test.c | 103 +--
> tools/testing/selftests/landlock/net_test.c | 55 +-
> .../testing/selftests/landlock/ptrace_test.c | 14 +-
> .../landlock/scoped_abstract_unix_test.c | 51 +-
> .../selftests/landlock/scoped_base_variants.h | 23 +
> .../selftests/landlock/scoped_common.h | 5 +-
> .../selftests/landlock/scoped_signal_test.c | 30 +-
> tools/testing/selftests/landlock/wrappers.h | 2 +-
> 39 files changed, 1877 insertions(+), 273 deletions(-)
> create mode 100644 Documentation/bpf/map_landlock_ruleset.rst
> create mode 100644 include/linux/landlock.h
> create mode 100644 tools/testing/selftests/bpf/prog_tests/landlock_kfuncs.c
> create mode 100644 tools/testing/selftests/bpf/progs/landlock_kfuncs.c
>
>
> base-commit: 8c6a27e02bc55ab110d1828610048b19f903aaec
> --
> 2.53.0
>
>
* Re: [RFC PATCH 00/20] BPF interface for applying Landlock rulesets
From: Justin Suess @ 2026-04-08 17:10 UTC (permalink / raw)
To: mic
Cc: andrii, ast, bpf, brauner, daniel, eddyz87, fred, gnoack, jack,
jmorris, john.fastabend, kees, kpsingh, linux-fsdevel,
linux-kernel, linux-security-module, m, martin.lau, paul,
Justin Suess
In-Reply-To: <20260408.ong9Eshe0omu@digikod.net>
Add a LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS flag, which calls
task_set_no_new_privs() on the current task, but only if the process
lacks the CAP_SYS_ADMIN capability.
While this operation is redundant for code running from userspace
(callers may achieve the same result by calling prctl() with
PR_SET_NO_NEW_PRIVS), this flag enables callers without access to the
syscall ABI (defined in subsequent patches) to restrict processes from
gaining additional privileges. This is important to ensure that
consumers can meet the task_no_new_privs || CAP_SYS_ADMIN invariant
enforced by Landlock without having syscall access.
This is implemented by hooking bprm_committing_creds and adding a
landlock_cred_security flag to indicate that the next execution should
call task_set_no_new_privs() if the process doesn't possess
CAP_SYS_ADMIN. This ensures that task_set_no_new_privs() is only called
past the point of no return.
Cc: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---
On Wed, Apr 08, 2026 at 02:00:00 -0000, Mickaël Salaün wrote:
> > Points of Feedback
> > ===
> >
> > First, the new set_nnp_on_point_of_no_return field in struct linux_binprm.
> > This field was needed to request that task_set_no_new_privs be set during an
> > execution, but only after the execution has proceeded beyond the point of no
> > return. I couldn't find a way to express this semantic without adding a new
> > bitfield to struct linux_binprm and a conditional in fs/exec.c. Please see
> > patch 2.
> What about using security_bprm_committing_creds()?
Good idea. Definitely cleaner.
Something like this? Then dropping the "execve: Add set_nnp_on_point_of_no_return"
commit.
This adds a bitfield to the landlock_cred_security struct to indicate that the flag
should be set on the next exec(s).
include/uapi/linux/landlock.h | 14 ++++++++++++++
security/landlock/cred.c | 13 +++++++++++++
security/landlock/cred.h | 7 +++++++
security/landlock/limits.h | 2 +-
security/landlock/ruleset.c | 15 ++++++++++++---
security/landlock/syscalls.c | 5 +++++
6 files changed, 52 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
index f88fa1f68b77..edd9d9a7f60e 100644
--- a/include/uapi/linux/landlock.h
+++ b/include/uapi/linux/landlock.h
@@ -129,12 +129,26 @@ struct landlock_ruleset_attr {
*
* If the calling thread is running with no_new_privs, this operation
* enables no_new_privs on the sibling threads as well.
+ *
+ * %LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
+ * Sets no_new_privs on the calling thread before applying the Landlock domain.
+ * This flag is useful for convenience as well as for applying a ruleset from
+ * an outside context (e.g. BPF). This flag only has an effect when both
+ * no_new_privs isn't already set and the caller doesn't possess CAP_SYS_ADMIN.
+ *
+ * This flag has slightly different behavior when used from BPF. Instead of
+ * setting no_new_privs on the current task, it sets a flag on the bprm so that
+ * no_new_privs is set on the task at exec point-of-no-return. This guarantees
+ * that the current execution is unaffected, and may escalate as usual until the
+ * next exec, but the resulting task cannot gain more privileges through later
+ * exec transitions.
*/
/* clang-format off */
#define LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF (1U << 0)
#define LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON (1U << 1)
#define LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF (1U << 2)
#define LANDLOCK_RESTRICT_SELF_TSYNC (1U << 3)
+#define LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS (1U << 4)
/* clang-format on */
/**
diff --git a/security/landlock/cred.c b/security/landlock/cred.c
index 0cb3edde4d18..bcc9b716916f 100644
--- a/security/landlock/cred.c
+++ b/security/landlock/cred.c
@@ -43,6 +43,18 @@ static void hook_cred_free(struct cred *const cred)
landlock_put_ruleset_deferred(dom);
}
+static void hook_bprm_committing_creds(const struct linux_binprm *bprm)
+{
+ struct landlock_cred_security *const llcred = landlock_cred(bprm->cred);
+
+ if (llcred->set_nnp_on_committing_creds &&
+ !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN)) {
+ task_set_no_new_privs(current);
+ /* Don't need to set it again for subsequent execution. */
+ llcred->set_nnp_on_committing_creds = false;
+ }
+}
+
#ifdef CONFIG_AUDIT
static int hook_bprm_creds_for_exec(struct linux_binprm *const bprm)
@@ -55,6 +67,7 @@ static int hook_bprm_creds_for_exec(struct linux_binprm *const bprm)
#endif /* CONFIG_AUDIT */
static struct security_hook_list landlock_hooks[] __ro_after_init = {
+ LSM_HOOK_INIT(bprm_committing_creds, hook_bprm_committing_creds),
LSM_HOOK_INIT(cred_prepare, hook_cred_prepare),
LSM_HOOK_INIT(cred_transfer, hook_cred_transfer),
LSM_HOOK_INIT(cred_free, hook_cred_free),
diff --git a/security/landlock/cred.h b/security/landlock/cred.h
index c10a06727eb1..7ec6dd12ebc3 100644
--- a/security/landlock/cred.h
+++ b/security/landlock/cred.h
@@ -49,6 +49,13 @@ struct landlock_cred_security {
* not require a current domain.
*/
u8 log_subdomains_off : 1;
#endif /* CONFIG_AUDIT */
+ /**
+ * @set_nnp_on_committing_creds: Set if the domain should set NO_NEW_PRIVS on the
+ * execution past the point of no return in security_bprm_committing_creds().
+ * This is not a hierarchy configuration because the nnp state is inherited by
+ * exec and doesn't need further configuration.
+ */
+ u8 set_nnp_on_committing_creds : 1;
} __packed;
diff --git a/security/landlock/limits.h b/security/landlock/limits.h
index eb584f47288d..d298086a4180 100644
--- a/security/landlock/limits.h
+++ b/security/landlock/limits.h
@@ -31,7 +31,7 @@
#define LANDLOCK_MASK_SCOPE ((LANDLOCK_LAST_SCOPE << 1) - 1)
#define LANDLOCK_NUM_SCOPE __const_hweight64(LANDLOCK_MASK_SCOPE)
-#define LANDLOCK_LAST_RESTRICT_SELF LANDLOCK_RESTRICT_SELF_TSYNC
+#define LANDLOCK_LAST_RESTRICT_SELF LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
#define LANDLOCK_MASK_RESTRICT_SELF ((LANDLOCK_LAST_RESTRICT_SELF << 1) - 1)
/* clang-format on */
diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index 1d6fa74f2a52..ad0bd5994ec5 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -121,11 +121,13 @@ int landlock_restrict_cred_precheck(const __u32 flags,
/*
* Similar checks as for seccomp(2), except that an -EPERM may be
- * returned.
+ * returned, or no_new_privs may be set by the caller via
+ * LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS.
*/
if (!task_no_new_privs(current) &&
!ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN)) {
- return -EPERM;
+ if (!(flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS))
+ return -EPERM;
}
if (flags & ~LANDLOCK_MASK_RESTRICT_SELF)
@@ -140,7 +142,7 @@ int landlock_restrict_cred(struct cred *const cred,
{
struct landlock_cred_security *new_llcred;
bool __maybe_unused log_same_exec, log_new_exec, log_subdomains,
- prev_log_subdomains;
+ prev_log_subdomains, set_nnp_on_committing_creds;
/*
* It is allowed to set LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF without
@@ -157,6 +159,12 @@ int landlock_restrict_cred(struct cred *const cred,
log_new_exec = !!(flags & LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON);
/* Translates "off" flag to boolean. */
log_subdomains = !(flags & LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF);
+ /*
+ * Translates "on" flag to boolean. This flag is not inherited by exec,
+ * but the resulting nnp state is.
+ */
+ set_nnp_on_committing_creds =
+ !!(flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS);
new_llcred = landlock_cred(cred);
@@ -165,6 +173,7 @@ int landlock_restrict_cred(struct cred *const cred,
new_llcred->log_subdomains_off = !prev_log_subdomains ||
!log_subdomains;
#endif /* CONFIG_AUDIT */
+ new_llcred->set_nnp_on_committing_creds = set_nnp_on_committing_creds;
/*
* The only case when a ruleset may not be set is if
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index c6c7be7698a2..f3520c764360 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -397,6 +397,7 @@ SYSCALL_DEFINE4(landlock_add_rule, const int, ruleset_fd,
* - %LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON
* - %LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF
* - %LANDLOCK_RESTRICT_SELF_TSYNC
+ * - %LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
*
* This system call enforces a Landlock ruleset on the current thread.
* Enforcing a ruleset requires that the task has %CAP_SYS_ADMIN in its
@@ -450,6 +451,10 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
if (!new_cred)
return -ENOMEM;
+ if (flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS &&
+ !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN))
+ task_set_no_new_privs(current);
+
err = landlock_restrict_cred(new_cred, ruleset, flags);
if (err) {
abort_creds(new_cred);
--
2.53.0
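As a reading aid, the permission logic of the landlock_restrict_cred_precheck() hunk above can be modelled as a pure userspace function. This is a sketch under assumptions: precheck_model() and its has_nnp and has_cap_sys_admin parameters are hypothetical, the flag values are copied from the uapi hunk, and the -EINVAL return for unknown flags is assumed because the hunk does not show that line.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from the uapi hunk in this patch. */
#define LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF	(1U << 0)
#define LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON		(1U << 1)
#define LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF	(1U << 2)
#define LANDLOCK_RESTRICT_SELF_TSYNC			(1U << 3)
#define LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS		(1U << 4)

/* Same derivation as in security/landlock/limits.h after the patch. */
#define LANDLOCK_LAST_RESTRICT_SELF LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
#define LANDLOCK_MASK_RESTRICT_SELF ((LANDLOCK_LAST_RESTRICT_SELF << 1) - 1)

/*
 * Userspace model of the patched precheck. has_nnp and has_cap_sys_admin
 * are hypothetical parameters standing in for task_no_new_privs(current)
 * and ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN).
 */
int precheck_model(uint32_t flags, bool has_nnp, bool has_cap_sys_admin)
{
	/* Without nnp or CAP_SYS_ADMIN, only the new flag avoids -EPERM. */
	if (!has_nnp && !has_cap_sys_admin &&
	    !(flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS))
		return -EPERM;

	/* Assumption: unknown flags are rejected with -EINVAL. */
	if (flags & ~LANDLOCK_MASK_RESTRICT_SELF)
		return -EINVAL;

	return 0;
}
```

In this model, an unprivileged caller without no_new_privs is refused unless it passes LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS, and bumping LANDLOCK_LAST_RESTRICT_SELF is what widens the accepted mask to include bit 4.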
* Re: [PATCH 1/3] crypto: public_key: Remove check for valid hash_algo for ML-DSA keys
From: Stefan Berger @ 2026-04-08 17:25 UTC (permalink / raw)
To: Eric Biggers
Cc: linux-integrity, linux-security-module, linux-kernel, zohar,
roberto.sassu, David Howells, Lukas Wunner, Ignat Korchagin,
keyrings, linux-crypto
In-Reply-To: <20260406165350.GD2971@sol>
On 4/6/26 12:53 PM, Eric Biggers wrote:
> On Sun, Apr 05, 2026 at 07:12:22PM -0400, Stefan Berger wrote:
>> Remove the check for the hash_algo since ML-DSA is only used in pure mode
>> and there is no relevance of a hash_algo for the input data.
>>
>> Cc: David Howells <dhowells@redhat.com>
>> Cc: Lukas Wunner <lukas@wunner.de>
>> Cc: Ignat Korchagin <ignat@linux.win>
>> Cc: keyrings@vger.kernel.org
>> Cc: linux-crypto@vger.kernel.org
>> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
>> ---
>> crypto/asymmetric_keys/public_key.c | 5 -----
>> 1 file changed, 5 deletions(-)
>>
>> diff --git a/crypto/asymmetric_keys/public_key.c b/crypto/asymmetric_keys/public_key.c
>> index 09a0b83d5d77..df6918a77ab8 100644
>> --- a/crypto/asymmetric_keys/public_key.c
>> +++ b/crypto/asymmetric_keys/public_key.c
>> @@ -147,11 +147,6 @@ software_key_determine_akcipher(const struct public_key *pkey,
>> strcmp(pkey->pkey_algo, "mldsa87") == 0) {
>> if (strcmp(encoding, "raw") != 0)
>> return -EINVAL;
>> - if (!hash_algo)
>> - return -EINVAL;
>> - if (strcmp(hash_algo, "none") != 0 &&
>> - strcmp(hash_algo, "sha512") != 0)
>> - return -EINVAL;
>
> Does this broaden which hash algorithms are accepted for CMS signatures
> that use ML-DSA and contain signed attributes?
Right... dropping this patch and using the "none" route now.
>
> - Eric
>
* [PATCH v2 1/2] integrity: Refactor asymmetric_verify for reusability
From: Stefan Berger @ 2026-04-08 17:41 UTC (permalink / raw)
To: linux-integrity, linux-security-module
Cc: linux-kernel, zohar, roberto.sassu, ebiggers, Stefan Berger
In-Reply-To: <20260408174154.139606-1-stefanb@linux.ibm.com>
Refactor asymmetric_verify for reusability. Have it call
asymmetric_verify_common with the signature verification key and the
public_key structure as parameters. sigv3 support for ML-DSA will need to
check the public key type first to decide how to do the signature
verification and therefore will have these parameters available for
calling asymmetric_verify_common.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
security/integrity/digsig_asymmetric.c | 42 +++++++++++++++++---------
1 file changed, 28 insertions(+), 14 deletions(-)
diff --git a/security/integrity/digsig_asymmetric.c b/security/integrity/digsig_asymmetric.c
index 6e68ec3becbd..e29ed73f15cd 100644
--- a/security/integrity/digsig_asymmetric.c
+++ b/security/integrity/digsig_asymmetric.c
@@ -79,18 +79,15 @@ static struct key *request_asymmetric_key(struct key *keyring, uint32_t keyid)
return key;
}
-int asymmetric_verify(struct key *keyring, const char *sig,
- int siglen, const char *data, int datalen)
+static int asymmetric_verify_common(const struct key *key,
+ const struct public_key *pk,
+ const char *sig, int siglen,
+ const char *data, int datalen)
{
- struct public_key_signature pks;
struct signature_v2_hdr *hdr = (struct signature_v2_hdr *)sig;
- const struct public_key *pk;
- struct key *key;
+ struct public_key_signature pks;
int ret;
- if (siglen <= sizeof(*hdr))
- return -EBADMSG;
-
siglen -= sizeof(*hdr);
if (siglen != be16_to_cpu(hdr->sig_size))
@@ -99,15 +96,10 @@ int asymmetric_verify(struct key *keyring, const char *sig,
if (hdr->hash_algo >= HASH_ALGO__LAST)
return -ENOPKG;
- key = request_asymmetric_key(keyring, be32_to_cpu(hdr->keyid));
- if (IS_ERR(key))
- return PTR_ERR(key);
-
memset(&pks, 0, sizeof(pks));
pks.hash_algo = hash_algo_name[hdr->hash_algo];
- pk = asymmetric_key_public_key(key);
pks.pkey_algo = pk->pkey_algo;
if (!strcmp(pk->pkey_algo, "rsa")) {
pks.encoding = "pkcs1";
@@ -127,11 +119,33 @@ int asymmetric_verify(struct key *keyring, const char *sig,
pks.s_size = siglen;
ret = verify_signature(key, &pks);
out:
- key_put(key);
pr_debug("%s() = %d\n", __func__, ret);
return ret;
}
+int asymmetric_verify(struct key *keyring, const char *sig,
+ int siglen, const char *data, int datalen)
+{
+ struct signature_v2_hdr *hdr = (struct signature_v2_hdr *)sig;
+ const struct public_key *pk;
+ struct key *key;
+ int ret;
+
+ if (siglen <= sizeof(*hdr))
+ return -EBADMSG;
+
+ key = request_asymmetric_key(keyring, be32_to_cpu(hdr->keyid));
+ if (IS_ERR(key))
+ return PTR_ERR(key);
+ pk = asymmetric_key_public_key(key);
+
+ ret = asymmetric_verify_common(key, pk, sig, siglen, data, datalen);
+
+ key_put(key);
+
+ return ret;
+}
+
/*
* calc_file_id_hash - calculate the hash of the ima_file_id struct data
* @type: xattr type [enum evm_ima_xattr_type]
--
2.53.0
^ permalink raw reply related
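The ownership split introduced by this refactoring can be sketched in userspace C. This is only an illustration under stated assumptions: `struct mock_key`, `mock_request_key()`, `mock_key_put()`, `verify_common()` and `verify()` are hypothetical stand-ins and not the kernel API. The point is that the wrapper acquires and releases the key reference, while the common helper only borrows it, so any future caller that already holds a key can reuse the helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's refcounted struct key. */
struct mock_key {
	int refs;
};

/* Takes a reference, loosely modelling request_asymmetric_key(). */
struct mock_key *mock_request_key(struct mock_key *k)
{
	if (!k)
		return NULL;
	k->refs++;
	return k;
}

/* Drops a reference, loosely modelling key_put(). */
void mock_key_put(struct mock_key *k)
{
	assert(k->refs > 0);
	k->refs--;
}

/* Borrows the key: no return path changes the reference count. */
int verify_common(const struct mock_key *key, int siglen, int expected)
{
	(void)key;
	return siglen == expected ? 0 : -EBADMSG;
}

/* Owns the lookup: one acquire and one release, whatever the result. */
int verify(struct mock_key *keyring_entry, int siglen, int expected)
{
	struct mock_key *key = mock_request_key(keyring_entry);
	int ret;

	if (!key)
		return -ENOKEY;

	ret = verify_common(key, siglen, expected);
	mock_key_put(key);
	return ret;
}
```

With this shape, an error inside the helper cannot leak a reference, because the only release sits in the wrapper after the helper returns.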
* [PATCH v2 2/2] integrity: Add support for sigv3 verification using ML-DSA keys
From: Stefan Berger @ 2026-04-08 17:41 UTC (permalink / raw)
To: linux-integrity, linux-security-module
Cc: linux-kernel, zohar, roberto.sassu, ebiggers, Stefan Berger
In-Reply-To: <20260408174154.139606-1-stefanb@linux.ibm.com>
Add support for sigv3 signature verification using ML-DSA in pure mode.
When a sigv3 signature is verified, first check whether the key to use
for verification is an ML-DSA key and therefore uses a hashless signature
verification scheme. The hashless signature verification method uses the
ima_file_id structure directly for signature verification rather than
its digest.
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
v2: Set hash_algo in public_key_signature to "none"
---
security/integrity/digsig_asymmetric.c | 84 ++++++++++++++++++++++++--
1 file changed, 79 insertions(+), 5 deletions(-)
diff --git a/security/integrity/digsig_asymmetric.c b/security/integrity/digsig_asymmetric.c
index e29ed73f15cd..c80cb2b117a6 100644
--- a/security/integrity/digsig_asymmetric.c
+++ b/security/integrity/digsig_asymmetric.c
@@ -190,17 +190,91 @@ static int calc_file_id_hash(enum evm_ima_xattr_type type,
return rc;
}
+/*
+ * asymmetric_verify_v3_hashless - Use hashless signature verification on sigv3
+ * @key: The key to use for signature verification
+ * @pk: The associated public key
+ * @encoding: The encoding the key type uses
+ * @sig: The signature
+ * @siglen: The length of the xattr signature
+ * @algo: The hash algorithm
+ * @digest: The file digest
+ *
+ * Create an ima_file_id structure and use it for signature verification
+ * directly. This can be used for ML-DSA in pure mode for example.
+ */
+static int asymmetric_verify_v3_hashless(struct key *key,
+ const struct public_key *pk,
+ const char *encoding,
+ const char *sig, int siglen,
+ u8 algo,
+ const u8 *digest)
+{
+ struct signature_v2_hdr *hdr = (struct signature_v2_hdr *)sig;
+ struct ima_file_id file_id = {
+ .hash_type = hdr->type,
+ .hash_algorithm = algo,
+ };
+ size_t digest_size = hash_digest_size[algo];
+ struct public_key_signature pks = {
+ .m = (u8 *)&file_id,
+ .m_size = sizeof(file_id) - (HASH_MAX_DIGESTSIZE - digest_size),
+ .s = hdr->sig,
+ .s_size = siglen - sizeof(*hdr),
+ .pkey_algo = pk->pkey_algo,
+ .hash_algo = "none",
+ .encoding = encoding,
+ };
+ int ret;
+
+ if (hdr->type != IMA_VERITY_DIGSIG &&
+ hdr->type != EVM_IMA_XATTR_DIGSIG &&
+ hdr->type != EVM_XATTR_PORTABLE_DIGSIG)
+ return -EINVAL;
+
+ if (pks.s_size != be16_to_cpu(hdr->sig_size))
+ return -EBADMSG;
+
+ memcpy(file_id.hash, digest, digest_size);
+
+ ret = verify_signature(key, &pks);
+ pr_debug("%s() = %d\n", __func__, ret);
+ return ret;
+}
+
int asymmetric_verify_v3(struct key *keyring, const char *sig, int siglen,
const char *data, int datalen, u8 algo)
{
struct signature_v2_hdr *hdr = (struct signature_v2_hdr *)sig;
struct ima_max_digest_data hash;
+ const struct public_key *pk;
+ struct key *key;
int rc;
- rc = calc_file_id_hash(hdr->type, algo, data, &hash);
- if (rc)
- return -EINVAL;
+ if (siglen <= sizeof(*hdr))
+ return -EBADMSG;
+
+ key = request_asymmetric_key(keyring, be32_to_cpu(hdr->keyid));
+ if (IS_ERR(key))
+ return PTR_ERR(key);
- return asymmetric_verify(keyring, sig, siglen, hash.digest,
- hash.hdr.length);
+ pk = asymmetric_key_public_key(key);
+ if (!strncmp(pk->pkey_algo, "mldsa", 5)) {
+ rc = asymmetric_verify_v3_hashless(key, pk, "raw",
+ sig, siglen, algo, data);
+ } else {
+ rc = calc_file_id_hash(hdr->type, algo, data, &hash);
+ if (rc) {
+ rc = -EINVAL;
+ goto err_exit;
+ }
+
+ rc = asymmetric_verify_common(key, pk, sig, siglen, hash.digest,
+ hash.hdr.length);
+ }
+
+err_exit:
+ key_put(key);
+
+ return rc;
}
--
2.53.0
^ permalink raw reply related
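The message length passed to the hashless verification above (`m_size`) covers only the used part of the digest buffer. A minimal userspace sketch of that arithmetic follows. It assumes that `struct ima_file_id` is a packed structure with two one-byte fields followed by a `HASH_MAX_DIGESTSIZE` buffer and that `HASH_MAX_DIGESTSIZE` is 64; both are restated locally here and should be checked against the kernel tree. `file_id_signed_len()` and `build_file_id()` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: mirrors the kernel's largest supported digest size. */
#define HASH_MAX_DIGESTSIZE 64

/* Assumed layout of the kernel's packed struct ima_file_id. */
struct ima_file_id {
	uint8_t hash_type;      /* xattr type, e.g. IMA_VERITY_DIGSIG */
	uint8_t hash_algorithm; /* enum hash_algo value */
	uint8_t hash[HASH_MAX_DIGESTSIZE];
} __attribute__((packed));

/* Number of leading bytes of the structure that are actually signed. */
size_t file_id_signed_len(size_t digest_size)
{
	assert(digest_size <= HASH_MAX_DIGESTSIZE);
	return sizeof(struct ima_file_id) - (HASH_MAX_DIGESTSIZE - digest_size);
}

/* Fills the structure and returns the length of the message to verify. */
size_t build_file_id(struct ima_file_id *id, uint8_t type, uint8_t algo,
		     const uint8_t *digest, size_t digest_size)
{
	memset(id, 0, sizeof(*id));
	id->hash_type = type;
	id->hash_algorithm = algo;
	memcpy(id->hash, digest, digest_size);
	return file_id_signed_len(digest_size);
}
```

Under these assumptions a 32-byte digest yields a 34-byte message (two header bytes plus the digest), and a 64-byte digest yields the full 66-byte structure.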
* [PATCH v2 0/2] Add support for ML-DSA signature for EVM and IMA
From: Stefan Berger @ 2026-04-08 17:41 UTC (permalink / raw)
To: linux-integrity, linux-security-module
Cc: linux-kernel, zohar, roberto.sassu, ebiggers, Stefan Berger
Based on IMA sigv3 type of signatures, add support for ML-DSA signature
for EVM and IMA. Use the existing ML-DSA hashless signing mode (pure mode).
Stefan
v2:
- Dropped 1/3
- Using "none" as hash_algo in 2/2
Stefan Berger (2):
integrity: Refactor asymmetric_verify for reusability
integrity: Add support for sigv3 verification using ML-DSA keys
security/integrity/digsig_asymmetric.c | 126 +++++++++++++++++++++----
1 file changed, 107 insertions(+), 19 deletions(-)
base-commit: 82bbd447199ff1441031d2eaf9afe041550cf525
--
2.53.0
^ permalink raw reply
* Re: [RFC PATCH 00/20] BPF interface for applying Landlock rulesets
From: Mickaël Salaün @ 2026-04-08 19:21 UTC (permalink / raw)
To: Justin Suess
Cc: andrii, ast, bpf, brauner, daniel, eddyz87, fred, gnoack, jack,
jmorris, john.fastabend, kees, kpsingh, linux-fsdevel,
linux-kernel, linux-security-module, m, martin.lau, paul
In-Reply-To: <20260408171030.4083129-1-utilityemal77@gmail.com>
On Wed, Apr 08, 2026 at 01:10:28PM -0400, Justin Suess wrote:
>
> Add a flag LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS, which executes
> task_set_no_new_privs on the current credentials, but only if
> the process lacks the CAP_SYS_ADMIN capability.
>
> While this operation is redundant for code running from userspace
> (indeed callers may achieve the same logic by calling
> prctl w/ PR_SET_NO_NEW_PRIVS), this flag enables callers without access
> to the syscall abi (defined in subsequent patches) to restrict processes
> from gaining additional capabilities. This is important to ensure that
> consumers can meet the task_no_new_privs || CAP_SYS_ADMIN invariant
> enforced by Landlock without having syscall access.
>
> This is done by hooking bprm_committing_creds along with a
> landlock_cred_security flag to indicate that the next execution should
> task_set_no_new_privs if the process doesn't possess CAP_SYS_ADMIN. This
> is done to ensure that task_set_no_new_privs is being done past the
> point of no return.
>
> Cc: Mickaël Salaün <mic@digikod.net>
> Signed-off-by: Justin Suess <utilityemal77@gmail.com>
> ---
>
> On Wed, Apr 08, 2026 at 02:00:00 -0000, Mickaël Salaün wrote:
> > > Points of Feedback
> > > ===
> > >
> > > First, the new set_nnp_on_point_of_no_return field in struct linux_binprm.
> > > This field was needed to request that task_set_no_new_privs be set during an
> > > execution, but only after the execution has proceeded beyond the point of no
> > > return. I couldn't find a way to express this semantic without adding a new
> > > bitfield to struct linux_binprm and a conditional in fs/exec.c. Please see
> > > patch 2.
>
> > What about using security_bprm_committing_creds()?
>
> Good idea. Definitely cleaner.
>
> Something like this? Then dropping the "execve: Add set_nnp_on_point_of_no_return"
> commit.
>
> This adds a bitfield to the landlock_cred_security struct to indicate that the flag
> should be set on the next exec(s).
>
> include/uapi/linux/landlock.h | 14 ++++++++++++++
> security/landlock/cred.c | 13 +++++++++++++
> security/landlock/cred.h | 7 +++++++
> security/landlock/limits.h | 2 +-
> security/landlock/ruleset.c | 15 ++++++++++++---
> security/landlock/syscalls.c | 5 +++++
> 6 files changed, 52 insertions(+), 4 deletions(-)
>
> diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
> index f88fa1f68b77..edd9d9a7f60e 100644
> --- a/include/uapi/linux/landlock.h
> +++ b/include/uapi/linux/landlock.h
> @@ -129,12 +129,26 @@ struct landlock_ruleset_attr {
> *
> * If the calling thread is running with no_new_privs, this operation
> * enables no_new_privs on the sibling threads as well.
> + *
> + * %LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
> + * Sets no_new_privs on the calling thread before applying the Landlock domain.
> + * This flag is useful for convenience as well as for applying a ruleset from
> + * an outside context (e.g. BPF). This flag only has an effect when both
> + * no_new_privs isn't already set and the caller doesn't possess CAP_SYS_ADMIN.
> + *
> + * This flag has slightly different behavior when used from BPF. Instead of
> + * setting no_new_privs on the current task, it sets a flag on the bprm so that
> + * no_new_privs is set on the task at exec point-of-no-return. This guarantees
> + * that the current execution is unaffected, and may escalate as usual until the
> + * next exec, but the resulting task cannot gain more privileges through later
> + * exec transitions.
> */
> /* clang-format off */
> #define LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF (1U << 0)
> #define LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON (1U << 1)
> #define LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF (1U << 2)
> #define LANDLOCK_RESTRICT_SELF_TSYNC (1U << 3)
> +#define LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS (1U << 4)
> /* clang-format on */
>
> /**
> diff --git a/security/landlock/cred.c b/security/landlock/cred.c
> index 0cb3edde4d18..bcc9b716916f 100644
> --- a/security/landlock/cred.c
> +++ b/security/landlock/cred.c
> @@ -43,6 +43,18 @@ static void hook_cred_free(struct cred *const cred)
> landlock_put_ruleset_deferred(dom);
> }
>
> +static void hook_bprm_committing_creds(const struct linux_binprm *bprm)
> +{
> + struct landlock_cred_security *const llcred = landlock_cred(bprm->cred);
> +
> + if (llcred->set_nnp_on_committing_creds &&
> + !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN)) {
If asked by the caller, NNP must be set, whatever the capabilities of
the task.
> + task_set_no_new_privs(current);
> + /* Don't need to set it again for subsequent execution. */
> + llcred->set_nnp_on_committing_creds = false;
> + }
Thinking more about it, it would make more sense to add another flag to
enforce restriction on the next exec. This new cred bit would then be
generic and enforce both NNP (if set) and the domain once we know the
execution is ok. That should also bring the required plumbing to
create the domain at syscall (or kfunc) time and handle memory
allocation issue there, but only enforce it at exec time with
security_bprm_committing_creds() (without any possible error).
> +}
> +
> #ifdef CONFIG_AUDIT
>
> static int hook_bprm_creds_for_exec(struct linux_binprm *const bprm)
> @@ -55,6 +67,7 @@ static int hook_bprm_creds_for_exec(struct linux_binprm *const bprm)
> #endif /* CONFIG_AUDIT */
>
> static struct security_hook_list landlock_hooks[] __ro_after_init = {
> + LSM_HOOK_INIT(bprm_committing_creds, hook_bprm_committing_creds),
> LSM_HOOK_INIT(cred_prepare, hook_cred_prepare),
> LSM_HOOK_INIT(cred_transfer, hook_cred_transfer),
> LSM_HOOK_INIT(cred_free, hook_cred_free),
> diff --git a/security/landlock/cred.h b/security/landlock/cred.h
> index c10a06727eb1..7ec6dd12ebc3 100644
> --- a/security/landlock/cred.h
> +++ b/security/landlock/cred.h
> @@ -49,6 +49,13 @@ struct landlock_cred_security {
> * not require a current domain.
> */
> u8 log_subdomains_off : 1;
> + /**
> + * @set_nnp_on_committing_creds: Set if the domain should set NO_NEW_PRIVS on the
> + * execution past the point of no return in security_bprm_committing_creds().
> + * This is not a hierarchy configuration because the nnp state is inherited by
> + * exec and doesn't need further configuration.
> + */
> + u8 set_nnp_on_committing_creds : 1;
> #endif /* CONFIG_AUDIT */
> } __packed;
>
> diff --git a/security/landlock/limits.h b/security/landlock/limits.h
> index eb584f47288d..d298086a4180 100644
> --- a/security/landlock/limits.h
> +++ b/security/landlock/limits.h
> @@ -31,7 +31,7 @@
> #define LANDLOCK_MASK_SCOPE ((LANDLOCK_LAST_SCOPE << 1) - 1)
> #define LANDLOCK_NUM_SCOPE __const_hweight64(LANDLOCK_MASK_SCOPE)
>
> -#define LANDLOCK_LAST_RESTRICT_SELF LANDLOCK_RESTRICT_SELF_TSYNC
> +#define LANDLOCK_LAST_RESTRICT_SELF LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
> #define LANDLOCK_MASK_RESTRICT_SELF ((LANDLOCK_LAST_RESTRICT_SELF << 1) - 1)
>
> /* clang-format on */
> diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
> index 1d6fa74f2a52..ad0bd5994ec5 100644
> --- a/security/landlock/ruleset.c
> +++ b/security/landlock/ruleset.c
> @@ -121,11 +121,13 @@ int landlock_restrict_cred_precheck(const __u32 flags,
>
> /*
> * Similar checks as for seccomp(2), except that an -EPERM may be
> - * returned.
> + * returned, or no_new_privs may be set by the caller via
> + * LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS.
> */
> if (!task_no_new_privs(current) &&
> !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN)) {
> - return -EPERM;
> + if (!(flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS))
> + return -EPERM;
> }
>
> if (flags & ~LANDLOCK_MASK_RESTRICT_SELF)
> @@ -140,7 +142,7 @@ int landlock_restrict_cred(struct cred *const cred,
> {
> struct landlock_cred_security *new_llcred;
> bool __maybe_unused log_same_exec, log_new_exec, log_subdomains,
> - prev_log_subdomains;
> + prev_log_subdomains, set_nnp_on_committing_creds;
>
> /*
> * It is allowed to set LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF without
> @@ -157,6 +159,12 @@ int landlock_restrict_cred(struct cred *const cred,
> log_new_exec = !!(flags & LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON);
> /* Translates "off" flag to boolean. */
> log_subdomains = !(flags & LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF);
> + /*
> + * Translates "on" flag to boolean. This flag is not inherited by exec,
> + * but the resulting nnp state is.
> + */
> + set_nnp_on_committing_creds =
> + !!(flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS);
>
> new_llcred = landlock_cred(cred);
>
> @@ -165,6 +173,7 @@ int landlock_restrict_cred(struct cred *const cred,
> new_llcred->log_subdomains_off = !prev_log_subdomains ||
> !log_subdomains;
> #endif /* CONFIG_AUDIT */
> + new_llcred->set_nnp_on_committing_creds = set_nnp_on_committing_creds;
>
> /*
> * The only case when a ruleset may not be set is if
> diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
> index c6c7be7698a2..f3520c764360 100644
> --- a/security/landlock/syscalls.c
> +++ b/security/landlock/syscalls.c
> @@ -397,6 +397,7 @@ SYSCALL_DEFINE4(landlock_add_rule, const int, ruleset_fd,
> * - %LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON
> * - %LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF
> * - %LANDLOCK_RESTRICT_SELF_TSYNC
> + * - %LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
> *
> * This system call enforces a Landlock ruleset on the current thread.
> * Enforcing a ruleset requires that the task has %CAP_SYS_ADMIN in its
> @@ -450,6 +451,10 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
> if (!new_cred)
> return -ENOMEM;
>
> + if (flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS &&
> + !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN))
> + task_set_no_new_privs(current);
> +
> err = landlock_restrict_cred(new_cred, ruleset, flags);
> if (err) {
> abort_creds(new_cred);
> --
> 2.53.0
>
>
^ permalink raw reply
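The flag mask and the precheck from the quoted patch can be modelled in plain C. This is a sketch of the patch as posted, not of the behaviour requested in the review: the flag values are copied from the quoted diff, while `restrict_self_precheck()` and its boolean parameters are hypothetical stand-ins for `task_no_new_privs(current)` and the `CAP_SYS_ADMIN` check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as they appear in the quoted patch. */
#define LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF	(1U << 0)
#define LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON		(1U << 1)
#define LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF	(1U << 2)
#define LANDLOCK_RESTRICT_SELF_TSYNC			(1U << 3)
#define LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS		(1U << 4)

#define LANDLOCK_LAST_RESTRICT_SELF LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
#define LANDLOCK_MASK_RESTRICT_SELF ((LANDLOCK_LAST_RESTRICT_SELF << 1) - 1)

/* Moving the "last" flag to bit 4 widens the mask to bits 0..4. */
static_assert(LANDLOCK_MASK_RESTRICT_SELF == 0x1fU, "mask covers bits 0..4");

/*
 * Hypothetical model of the precheck: nnp stands for
 * task_no_new_privs(current), cap_sys_admin for the capability check.
 */
int restrict_self_precheck(uint32_t flags, bool nnp, bool cap_sys_admin)
{
	/* Without NNP or CAP_SYS_ADMIN, only the new flag avoids -EPERM. */
	if (!nnp && !cap_sys_admin &&
	    !(flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS))
		return -EPERM;

	/* Unknown flag bits are rejected after the privilege check. */
	if (flags & ~LANDLOCK_MASK_RESTRICT_SELF)
		return -EINVAL;

	return 0;
}
```

In this model an unprivileged caller without no_new_privs gets `-EPERM` for `flags == 0` and `0` once `LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS` is passed; the privilege check runs before the unknown-flag check, as in the quoted hunk.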
* Re: [PATCH v2] KEYS: trusted: Debugging as a feature
From: Nayna Jain @ 2026-04-09 0:41 UTC (permalink / raw)
To: Jarkko Sakkinen
Cc: linux-integrity, keyrings, Srish Srinivasan, James Bottomley,
Mimi Zohar, David Howells, Paul Moore, James Morris,
Serge E. Hallyn, Ahmad Fatoum, Pengutronix Kernel Team, open list,
open list:SECURITY SUBSYSTEM
In-Reply-To: <adYRURAJfNCu0FYB@kernel.org>
On 4/8/26 4:26 AM, Jarkko Sakkinen wrote:
> On Mon, Apr 06, 2026 at 10:42:00PM -0400, Nayna Jain wrote:
>> On 3/24/26 7:00 AM, Jarkko Sakkinen wrote:
>>> TPM_DEBUG, and other similar flags, are a non-standard way to specify a
>>> feature in Linux kernel. Introduce CONFIG_TRUSTED_KEYS_DEBUG for
>>> trusted keys, and use it to replace these ad-hoc feature flags.
>>>
>>> Given that trusted keys debug dumps can contain sensitive data, harden
>>> the feature as follows:
>>>
>>> 1. In the Kconfig description postulate that pr_debug() statements must be
>>> used.
>>> 2. Use pr_debug() statements in TPM 1.x driver to print the protocol dump.
>>>
>>> Traces, when actually needed, can be easily enabled by providing
>>> trusted.dyndbg='+p' in the kernel command-line.
>>>
>>> Cc: Srish Srinivasan <ssrish@linux.ibm.com>
>>> Reported-by: Nayna Jain <nayna@linux.ibm.com>
>>> Closes: https://lore.kernel.org/all/7f8b8478-5cd8-4d97-bfd0-341fd5cf10f9@linux.ibm.com/
>>> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
>>> ---
>>> v2:
>>> - Implement for all trusted keys backends.
>>> - Add HAVE_TRUSTED_KEYS_DEBUG as it is a good practice despite full
>>> coverage.
>>> ---
>>> include/keys/trusted-type.h | 18 +++++-------
>>> security/keys/trusted-keys/Kconfig | 19 ++++++++++++
>>> security/keys/trusted-keys/trusted_caam.c | 4 +--
>>> security/keys/trusted-keys/trusted_tpm1.c | 36 +++++++++++------------
>>> 4 files changed, 46 insertions(+), 31 deletions(-)
>>>
>>> diff --git a/include/keys/trusted-type.h b/include/keys/trusted-type.h
>>> index 03527162613f..620a1f890b6b 100644
>>> --- a/include/keys/trusted-type.h
>>> +++ b/include/keys/trusted-type.h
>>> @@ -83,18 +83,16 @@ struct trusted_key_source {
>>> extern struct key_type key_type_trusted;
>>> -#define TRUSTED_DEBUG 0
>>> -
>>> -#if TRUSTED_DEBUG
>>> +#ifdef CONFIG_TRUSTED_KEYS_DEBUG
>>> static inline void dump_payload(struct trusted_key_payload *p)
>>> {
>>> - pr_info("key_len %d\n", p->key_len);
>>> - print_hex_dump(KERN_INFO, "key ", DUMP_PREFIX_NONE,
>>> - 16, 1, p->key, p->key_len, 0);
>>> - pr_info("bloblen %d\n", p->blob_len);
>>> - print_hex_dump(KERN_INFO, "blob ", DUMP_PREFIX_NONE,
>>> - 16, 1, p->blob, p->blob_len, 0);
>>> - pr_info("migratable %d\n", p->migratable);
>>> + pr_debug("key_len %d\n", p->key_len);
>>> + print_hex_dump_debug("key ", DUMP_PREFIX_NONE,
>>> + 16, 1, p->key, p->key_len, 0);
>>> + pr_debug("bloblen %d\n", p->blob_len);
>>> + print_hex_dump_debug("blob ", DUMP_PREFIX_NONE,
>>> + 16, 1, p->blob, p->blob_len, 0);
>>> + pr_debug("migratable %d\n", p->migratable);
>>> }
>>> #else
>>> static inline void dump_payload(struct trusted_key_payload *p)
>>> diff --git a/security/keys/trusted-keys/Kconfig b/security/keys/trusted-keys/Kconfig
>>> index 9e00482d886a..2ad9ba0e03f1 100644
>>> --- a/security/keys/trusted-keys/Kconfig
>>> +++ b/security/keys/trusted-keys/Kconfig
>>> @@ -1,10 +1,25 @@
>>> config HAVE_TRUSTED_KEYS
>>> bool
>>> +config HAVE_TRUSTED_KEYS_DEBUG
>>> + bool
>>> +
>>> +config TRUSTED_KEYS_DEBUG
>>> + bool "Debug trusted keys"
>>> + depends on HAVE_TRUSTED_KEYS_DEBUG
>>> + default n
>>> + help
>>> + Trusted keys backends and core code that support debug dumps
>>> + can opt-in that feature here. Dumps must only use DEBUG
>>> + level output, as sensitive data may pass by. In the
>>> + kernel-command line traces can be enabled via
>>> + trusted.dyndbg='+p'.
>> Would it be a good idea to add an explicit note/warning:
>>
>>
>> NOTE: This option is intended for debugging purposes only. Do not enable on
>> production systems as debug output may expose sensitive cryptographic
>> material.
>> If you are unsure, say N.
>>
>> Apart from this, looks good to me.
>>
>> Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
> Thanks, I'll add your tag, but would you mind quickly screening v3 again,
> where I add "trusted.debug=0|1"? And yes, your suggestion about an extra
> warning makes sense.
Sure, Jarkko. However, I don't see a v3 in my inbox or on
linux-integrity. Or are you about to post it soon?
>
> Let's make this as safe as possible. Mistakes do happen... and then those
> measures pay off :-)
Yes, agreed.
>
> BR, Jarkko
Thanks & Regards,
- Nayna
^ permalink raw reply
* Re: [PATCH v4 3/3] selinux: fix overlayfs mmap() and mprotect() access checks
From: Ondrej Mosnacek @ 2026-04-09 9:16 UTC (permalink / raw)
To: Paul Moore
Cc: Stephen Smalley, linux-security-module, selinux, linux-fsdevel,
linux-unionfs, linux-erofs, Amir Goldstein, Gao Xiang,
Christian Brauner
In-Reply-To: <CAHC9VhQ5PH99EQBuYq4c7Jf82UXiDfC7qzM2kvnZuyH6yFPL_Q@mail.gmail.com>
On Tue, Apr 7, 2026 at 10:21 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Tue, Apr 7, 2026 at 3:20 PM Stephen Smalley
> <stephen.smalley.work@gmail.com> wrote:
> > On Tue, Apr 7, 2026 at 10:35 AM Paul Moore <paul@paul-moore.com> wrote:
> > > On Tue, Apr 7, 2026 at 8:14 AM Stephen Smalley
> > > <stephen.smalley.work@gmail.com> wrote:
> > > > On Thu, Apr 2, 2026 at 11:09 PM Paul Moore <paul@paul-moore.com> wrote:
> > > > >
> > > > > The existing SELinux security model for overlayfs is to allow access if
> > > > > the current task is able to access the top level file (the "user" file)
> > > > > and the mounter's credentials are sufficient to access the lower
> > > > > level file (the "backing" file). Unfortunately, the current code does
> > > > > not properly enforce these access controls for both mmap() and mprotect()
> > > > > operations on overlayfs filesystems.
> > > > >
> > > > > This patch makes use of the newly created security_mmap_backing_file()
> > > > > LSM hook to provide the missing backing file enforcement for mmap()
> > > > > operations, and leverages the backing file API and new LSM blob to
> > > > > provide the necessary information to properly enforce the mprotect()
> > > > > access controls.
> > > > >
> > > > > Cc: stable@vger.kernel.org
> > > > > Signed-off-by: Paul Moore <paul@paul-moore.com>
> > > >
> > > > Do you have tests for these changes showing the before and after (i.e.
> > > > failing without your patches, passing with them)? I tried running an
> > > > earlier set from Ondrej but they failed.
> > >
> > > A few months ago I sent you and Ondrej some feedback on those early
> > > tests from Ondrej, but yes, I also had problems with Ondrej's tests.
> > > I've been using a hacked up combination of the existing tests, some of
> > > Ondrej's additions, and an additional debug/test patch to ensure the
> > > labeling is correct. It's far from ideal, but I didn't invest time in
> > > test development as I assumed Ondrej would continue his efforts there
> > > (unfortunately it doesn't appear that he has?), and I wanted to focus
> > > on getting a solution as soon as possible for obvious reasons.
> >
> > Ok, I'm happy to look at even unpolished tests - just want something I
> > can use to exercise the before and after states.
>
> Hopefully Ondrej can provide an updated patch.
Sorry for the radio silence... I just posted the fixed patch to the list.
I also pushed a more targeted standalone TMT/beakerlib test here,
which also tests the dynamic transition situation:
https://src.fedoraproject.org/fork/omos/tests/selinux/blob/overlayfs-mmap-bugs/f/kernel/overlayfs-mmap-bugs
To run it on Fedora, it should be enough to `dnf install -y beakerlib
selinux-policy-devel gcc` and run the runtest.sh script directly.
--
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.
^ permalink raw reply
* Re: [PATCH v2 0/4] Firmware LSM hook
From: Roberto Sassu @ 2026-04-09 12:27 UTC (permalink / raw)
To: Leon Romanovsky, KP Singh, Matt Bobrowski, Alexei Starovoitov,
Daniel Borkmann, John Fastabend, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
Jason Gunthorpe, Saeed Mahameed, Itay Avraham, Dave Jiang,
Jonathan Cameron
Cc: bpf, linux-kernel, linux-kselftest, linux-rdma, Chiara Meiohas,
Maher Sanalla, paul, linux-security-module
In-Reply-To: <20260409121230.GA720371@unreal>
On Thu, 2026-04-09 at 15:12 +0300, Leon Romanovsky wrote:
> On Tue, Mar 31, 2026 at 08:56:32AM +0300, Leon Romanovsky wrote:
> > From Chiara:
> >
> > This patch set introduces a new BPF LSM hook to validate firmware commands
> > triggered by userspace before they are submitted to the device. The hook
> > runs after the command buffer is constructed, right before it is sent
> > to firmware.
>
> <...>
>
> > ---
> > Chiara Meiohas (4):
> > bpf: add firmware command validation hook
> > selftests/bpf: add test cases for fw_validate_cmd hook
> > RDMA/mlx5: Externally validate FW commands supplied in DEVX interface
> > fwctl/mlx5: Externally validate FW commands supplied in fwctl
>
> Hi,
>
> Can we get Ack from BPF/LSM side?
+ Paul, linux-security-module ML
Hi
probably you also want to get an Ack from the LSM maintainer (added in
CC with the list). Most likely, he will also ask you to create the
security_*() functions counterparts of the BPF hooks.
Roberto
> Thanks
>
> >
> > drivers/fwctl/mlx5/main.c | 12 +++++-
> > drivers/infiniband/hw/mlx5/devx.c | 49 ++++++++++++++++++------
> > include/linux/bpf_lsm.h | 41 ++++++++++++++++++++
> > kernel/bpf/bpf_lsm.c | 11 ++++++
> > tools/testing/selftests/bpf/progs/verifier_lsm.c | 23 +++++++++++
> > 5 files changed, 122 insertions(+), 14 deletions(-)
> > ---
> > base-commit: 11439c4635edd669ae435eec308f4ab8a0804808
> > change-id: 20260309-fw-lsm-hook-7c094f909ffc
> >
> > Best regards,
> > --
> > Leon Romanovsky <leonro@nvidia.com>
> >
^ permalink raw reply
* Re: [PATCH v2 0/4] Firmware LSM hook
From: Leon Romanovsky @ 2026-04-09 12:45 UTC (permalink / raw)
To: Roberto Sassu
Cc: KP Singh, Matt Bobrowski, Alexei Starovoitov, Daniel Borkmann,
John Fastabend, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Song Liu, Yonghong Song, Stanislav Fomichev,
Hao Luo, Jiri Olsa, Shuah Khan, Jason Gunthorpe, Saeed Mahameed,
Itay Avraham, Dave Jiang, Jonathan Cameron, bpf, linux-kernel,
linux-kselftest, linux-rdma, Chiara Meiohas, Maher Sanalla, paul,
linux-security-module
In-Reply-To: <2dd138a2ae87f90c55dbc3178d9c798294fd4450.camel@huaweicloud.com>
On Thu, Apr 09, 2026 at 02:27:43PM +0200, Roberto Sassu wrote:
> On Thu, 2026-04-09 at 15:12 +0300, Leon Romanovsky wrote:
> > On Tue, Mar 31, 2026 at 08:56:32AM +0300, Leon Romanovsky wrote:
> > > From Chiara:
> > >
> > > This patch set introduces a new BPF LSM hook to validate firmware commands
> > > triggered by userspace before they are submitted to the device. The hook
> > > runs after the command buffer is constructed, right before it is sent
> > > to firmware.
> >
> > <...>
> >
> > > ---
> > > Chiara Meiohas (4):
> > > bpf: add firmware command validation hook
> > > selftests/bpf: add test cases for fw_validate_cmd hook
> > > RDMA/mlx5: Externally validate FW commands supplied in DEVX interface
> > > fwctl/mlx5: Externally validate FW commands supplied in fwctl
> >
> > Hi,
> >
> > Can we get Ack from BPF/LSM side?
>
> + Paul, linux-security-module ML
>
> Hi
>
> probably you also want to get an Ack from the LSM maintainer (added in
> CC with the list). Most likely, he will also ask you to create the
> security_*() functions counterparts of the BPF hooks.
We implemented this approach in v1:
https://patch.msgid.link/20260309-fw-lsm-hook-v1-0-4a6422e63725@nvidia.com
and were advised to pursue a different direction.
Thanks
>
> Roberto
>
> > Thanks
> >
> > >
> > > drivers/fwctl/mlx5/main.c | 12 +++++-
> > > drivers/infiniband/hw/mlx5/devx.c | 49 ++++++++++++++++++------
> > > include/linux/bpf_lsm.h | 41 ++++++++++++++++++++
> > > kernel/bpf/bpf_lsm.c | 11 ++++++
> > > tools/testing/selftests/bpf/progs/verifier_lsm.c | 23 +++++++++++
> > > 5 files changed, 122 insertions(+), 14 deletions(-)
> > > ---
> > > base-commit: 11439c4635edd669ae435eec308f4ab8a0804808
> > > change-id: 20260309-fw-lsm-hook-7c094f909ffc
> > >
> > > Best regards,
> > > --
> > > Leon Romanovsky <leonro@nvidia.com>
> > >
>
>
^ permalink raw reply
* Re: LSM: Whiteout chardev creation sidesteps mknod hook
From: Christian Brauner @ 2026-04-09 12:47 UTC (permalink / raw)
To: Serge Hallyn, Miklos Szeredi, Amir Goldstein
Cc: Günther Noack, Mickaël Salaün, Paul Moore,
linux-security-module
In-Reply-To: <06337e89-349a-4334-a735-b8dc9b566cdd@hallyn.com>
On Tue, Apr 07, 2026 at 12:15:00PM -0500, Serge Hallyn wrote:
> Apr 7, 2026 08:05:43 Günther Noack <gnoack@google.com>:
>
> > Hello Christian, Paul, Mickaël and LSM maintainers!
> >
> > I discovered the following bug in Landlock, which potentially also
> > affects other LSMs:
> >
> > With renameat2(2)'s RENAME_WHITEOUT flag, it is possible to create a
> > "whiteout object" at the source of the rename. Whiteout objects are
> > character devices with major/minor (0, 0) -- these devices are not
> > bound to any driver, so they are harmless, but still, the creation of
> > these files can sidestep the LANDLOCK_ACCESS_FS_MAKE_CHAR access right
> > in Landlock.
They aren't devices.
> >
> >
> > I am unsure which is the right fix here -- do you have an opinion
> > on this from the VFS/LSM side?
> >
> >
> > Option 1: Make filesystems call security_path_mknod() during RENAME_WHITEOUT?
No.
> >
> > Do it in the VFS rename hook.
> >
> > * Pro: Fixes it for all LSMs
> > * Con: Call would have to be done in multiple filesystems
> >
> >
> > Option 2: Handle it in security_{path,inode}_rename()
> >
> > Make Landlock handle it in security_inode_rename() by looking for the
> > RENAME_WHITEOUT flag.
> >
> > * Con: Operation should only be denied if the file system even
> > implements RENAME_WHITEOUT, and we would have to maintain a list of
Why? Just deny RENAME_WHITEOUT. What does it matter whether the filesystem
implements it or not? Overlayfs would fall back to non-RENAME_WHITEOUT
if not provided by the upper fs anyway.
> > affected filesystems for that. (That feels like solving it at the
> > wrong layer of abstraction.)
> > * Con: Unclear whether other LSMs need a similar fix
> >
> >
> > Option 3: Declare that this is working as intended?
>
> Option 3 has my vote.
Seconded.
>
>
> > * Pro: (0, 0) is not a "real" character device
> >
> >
> > In cases 1 and 2, we'd likely need to double check that we are not
> > breaking existing scenarios involving OverlayFS, by suddenly requiring
> > a more lax policy for creating character devices on these directories.
> >
> > Please let me know what you think. I'm specifically interested in:
> >
> > 1. Christian: What is the appropriate way to do this VFS wise?
> > 2. LSM maintainers: Is this a bug that affects other LSMs as well?
> >
> > Thanks,
> > —Günther
> >
> > P.S.: For full transparency, I found this bug by pointing Google
> > Gemini at the Landlock codebase.
Clearly.
^ permalink raw reply
* Re: [RFC PATCH v1 02/11] security: Add LSM_AUDIT_DATA_NS for namespace audit records
From: Christian Brauner @ 2026-04-09 13:29 UTC (permalink / raw)
To: Mickaël Salaün
Cc: Günther Noack, Paul Moore, Serge E . Hallyn, Justin Suess,
Lennart Poettering, Mikhail Ivanov, Nicolas Bouchinet,
Shervin Oloumi, Tingmao Wang, kernel-team, linux-fsdevel,
linux-kernel, linux-security-module
In-Reply-To: <20260401.AhPieg6heeth@digikod.net>
On Wed, Apr 01, 2026 at 08:48:23PM +0200, Mickaël Salaün wrote:
> On Wed, Apr 01, 2026 at 06:38:34PM +0200, Mickaël Salaün wrote:
> > On Wed, Mar 25, 2026 at 01:32:42PM +0100, Christian Brauner wrote:
> > > On Thu, Mar 12, 2026 at 11:04:35AM +0100, Mickaël Salaün wrote:
> > > > Add a new LSM audit data type LSM_AUDIT_DATA_NS that logs namespace
> > > > information in audit records. Two fields are provided, matching the
> > > > field names of struct ns_common:
> > > >
> > > > - ns_type: the CLONE_NEW* flag identifying the namespace type, logged in
> > > > hexadecimal.
> > > >
> > > > - inum: the proc inode number identifying a specific namespace instance.
> > > > Namespace inode numbers are allocated by proc_alloc_inum() via
> > > > ida_alloc_max() bounded to UINT_MAX, so the value always fits in 32
> > > > bits.
> > > >
> > > > A new audit data type is needed because no existing LSM_AUDIT_DATA_*
> > > > type carries namespace information. The closest alternatives (e.g.
> > > > LSM_AUDIT_DATA_TASK or LSM_AUDIT_DATA_NONE with custom strings) would
> > > > either lose the namespace type or require ad-hoc formatting that
> > > > bypasses the structured audit data union.
> > > >
> > > > Cc: Christian Brauner <brauner@kernel.org>
> > > > Cc: Günther Noack <gnoack@google.com>
> > > > Cc: Paul Moore <paul@paul-moore.com>
> > > > Signed-off-by: Mickaël Salaün <mic@digikod.net>
> > > > ---
> > > > include/linux/lsm_audit.h | 5 +++++
> > > > security/lsm_audit.c | 4 ++++
> > > > 2 files changed, 9 insertions(+)
> > > >
> > > > diff --git a/include/linux/lsm_audit.h b/include/linux/lsm_audit.h
> > > > index 382c56a97bba..6e20a56b8c22 100644
> > > > --- a/include/linux/lsm_audit.h
> > > > +++ b/include/linux/lsm_audit.h
> > > > @@ -78,6 +78,7 @@ struct common_audit_data {
> > > > #define LSM_AUDIT_DATA_NOTIFICATION 16
> > > > #define LSM_AUDIT_DATA_ANONINODE 17
> > > > #define LSM_AUDIT_DATA_NLMSGTYPE 18
> > > > +#define LSM_AUDIT_DATA_NS 19
> > > > union {
> > > > struct path path;
> > > > struct dentry *dentry;
> > > > @@ -100,6 +101,10 @@ struct common_audit_data {
> > > > int reason;
> > > > const char *anonclass;
> > > > u16 nlmsg_type;
> > > > + struct {
> > > > + u32 ns_type;
> > > > + unsigned int inum;
> > >
> > > fwiw, you might want to start the 64-bit namespace id as well.
> > > But either way:
> >
> > Right now these numbers are generated by ida_alloc_max(), which returns
> > an int. Is there an ongoing patch series for this change?
>
> OK, we should not use the inode's number (32-bit) but the namespace ID
> (64-bit) which is readable with the NS_GET_ID IOCTL on the namespace
> FDs. I'll use that with ns_id instead of inum. I'll also update the
> Landlock code and tests accordingly.
Yes, it's embedded in struct ns_common.
^ permalink raw reply
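As an aside for readers of the resulting audit records: the ns_type
field described in the quoted commit message carries a single
CLONE_NEW* flag, logged in hexadecimal. A minimal sketch of decoding
such a value in userspace follows. The helper name ns_type_name() and
the short names it returns are illustrative assumptions and are not
part of the patch.

```c
#include <linux/sched.h>	/* CLONE_NEW* flag values (UAPI header) */
#include <stddef.h>

/*
 * Hypothetical helper: map a CLONE_NEW* flag, as it would appear in the
 * ns_type field, to a short human-readable name. Returns NULL for any
 * value that is not exactly one known namespace flag.
 */
const char *ns_type_name(unsigned int ns_type)
{
	switch (ns_type) {
	case CLONE_NEWNS:
		return "mnt";
	case CLONE_NEWCGROUP:
		return "cgroup";
	case CLONE_NEWUTS:
		return "uts";
	case CLONE_NEWIPC:
		return "ipc";
	case CLONE_NEWUSER:
		return "user";
	case CLONE_NEWPID:
		return "pid";
	case CLONE_NEWNET:
		return "net";
#ifdef CLONE_NEWTIME
	case CLONE_NEWTIME:
		return "time";
#endif
	default:
		return NULL;
	}
}
```

For example, a record carrying ns_type=0x10000000 would decode to
"user" with this helper, since CLONE_NEWUSER is 0x10000000.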