Re: [RFC PATCH 00/20] BPF interface for applying Landlock rulesets

From: "Mickaël Salaün" <mic@digikod.net>
To: Justin Suess <utilityemal77@gmail.com>
Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	kpsingh@kernel.org, paul@paul-moore.com,
	viro@zeniv.linux.org.uk, brauner@kernel.org, kees@kernel.org,
	gnoack@google.com, jack@suse.cz, jmorris@namei.org,
	serge@hallyn.com, song@kernel.org, yonghong.song@linux.dev,
	martin.lau@linux.dev, m@maowtm.org, eddyz87@gmail.com,
	john.fastabend@gmail.com, sdf@fomichev.me,
	skhan@linuxfoundation.org, bpf@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Frederick Lawler <fred@cloudflare.com>
Subject: Re: [RFC PATCH 00/20] BPF interface for applying Landlock rulesets
Date: Wed, 8 Apr 2026 16:00:35 +0200
Message-ID: <20260408.ong9Eshe0omu@digikod.net>
In-Reply-To: <20260407200157.3874806-1-utilityemal77@gmail.com>

Thanks for this RFC.

On Tue, Apr 07, 2026 at 04:01:22PM -0400, Justin Suess wrote:
> Hello,
> 
> This series lets sleepable BPF LSM programs apply an existing,
> userspace-created Landlock ruleset to a program during exec.
> 
> The goal is not to move Landlock policy definition into BPF, nor to create a
> second policy engine.  Instead, BPF is used only to select when an already
> valid Landlock ruleset should be applied, based on runtime exec context.
> 
> Background
> ===
> 
> Landlock is primarily a syscall-driven, unprivileged-first LSM.  That model
> works well when the application being sandboxed can create and enforce its own
> rulesets, or when a trusted launcher can impose restrictions directly before
> running a trusted target.
> 
> That becomes harder when the target program is not under first-party control,
> for example:
> 
> 1. third-party binaries,
> 2. unmodified container images,
> 3. programs reached through shells, wrappers, or service managers, and
> 4. user-supplied or otherwise untrusted code.
> 
> In these cases, an external supervisor may want to apply a Landlock ruleset to
> the final executed program, while leaving unrelated parents or helper
> processes alone.
> 
> Why external sandboxing is awkward today
> ===
> 
> There are two recurring problems.
> 
> First, userspace cannot reliably predict every file a target may need across
> different systems, packaging layouts, and runtime conditions.  Shared
> libraries, configuration files, interpreters, and helper binaries often depend
> on details that are only known at runtime.

Agreed, it would make sense to leverage eBPF for this context
identification rather than implementing a Landlock-specific feature.

> 
> Second, Landlock inheritance is intentionally one-way.  Once a task is
> restricted, descendants inherit that domain and may only become more
> restricted.  This is exactly what Landlock should do, but it makes external
> sandboxing awkward when the program of interest is buried inside a larger exec
> chain.  Applying restrictions too early can affect unrelated intermediates;
> applying them too late misses the target entirely.

This makes sense too.

> 
> This series addresses that target-selection problem.
> 
> Overview
> ===
> 
> This series adds a small BPF-to-Landlock bridge:
> 
> 1. userspace creates a normal Landlock ruleset through the existing ABI;
> 2. userspace inserts that ruleset FD into a new
> 	BPF_MAP_TYPE_LANDLOCK_RULESET map;
> 3. a sleepable BPF LSM program attached to an exec-time hook looks up the
> 	ruleset; and
> 4. the program calls a kfunc to apply that ruleset to the new program's
> 	credentials before exec completes.
> 
> The important point is that BPF does not create, inspect, or mutate Landlock
> policy here.  It only decides whether to apply a ruleset that was already
> created and validated through Landlock's existing userspace API.

I like this approach.  It makes it possible for users to enforce Landlock
security policies on arbitrary new executions.  Sandboxing at this
specific point is the best time because it ensures consistency for the
whole lifetime of the process, whereas applying new restrictions in the
middle of an execution would make the process unstable (if the request
doesn't come from the process itself).
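
For reference, step 1 of the overview above already works with today's
ABI.  Here is a minimal userspace sketch, assuming the UAPI header
<linux/landlock.h> is installed.  The helper names landlock_abi_version()
and create_ruleset_handling() are made up for illustration.  Steps 2 to 4
are not shown because the new map type and the kfuncs only exist in this
RFC.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>          /* errno is set by syscall(2) on failure */
#include <linux/landlock.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallback for older libc headers; 444 is the unified syscall number. */
#ifndef __NR_landlock_create_ruleset
#define __NR_landlock_create_ruleset 444
#endif

/*
 * Hypothetical helper: query the Landlock ABI version of the running
 * kernel.  Returns the version (>= 1), or -1 with errno set when
 * Landlock is not available.
 */
static long landlock_abi_version(void)
{
	return syscall(__NR_landlock_create_ruleset, NULL, 0,
		       LANDLOCK_CREATE_RULESET_VERSION);
}

/*
 * Hypothetical helper: create a ruleset that handles the given
 * filesystem access rights.  Returns a ruleset FD, or -1 with errno
 * set.  No rule is added here, so every handled access would be denied
 * once the ruleset is enforced.
 */
static int create_ruleset_handling(__u64 handled_access_fs)
{
	struct landlock_ruleset_attr attr = {
		.handled_access_fs = handled_access_fs,
	};

	/* The kernel rejects an empty ruleset; catch this caller bug early. */
	assert(handled_access_fs != 0);

	return (int)syscall(__NR_landlock_create_ruleset, &attr,
			    sizeof(attr), 0);
}
```

A supervisor would call create_ruleset_handling(), add its rules with
landlock_add_rule(2), and only then hand the resulting FD over to the
proposed map.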

> 
> Interface
> ===
> 
> The series adds:
> 
> 1. bpf_landlock_restrict_binprm(), which applies a referenced ruleset to
> 	struct linux_binprm credentials;
> 2. bpf_landlock_put_ruleset(), which releases a referenced ruleset; and
> 3. BPF_MAP_TYPE_LANDLOCK_RULESET, a specialized map type for holding
> 	references to Landlock rulesets originating from userspace file
> 	descriptors.
> 4. A new field in the linux_binprm struct to enable application of
>    task_set_no_new_privs once execution is beyond the point of no return.

This "beyond the point of no return" is indeed important, and it would
be nice to also have this property for Landlock restrictions, i.e., only
create a Landlock domain if we know that the execution will succeed (or
if the caller will exit).  This is especially important for
logging/tracing event consistency.

> 
> The kfuncs are restricted to sleepable BPF LSM programs attached to
> bprm_creds_for_exec and bprm_creds_from_file, which are the points where the
> new program's credentials may still be updated safely.
> 
> This series also adds LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS.  On the BPF path,
> this is staged through the exec context and committed only after exec reaches
> point-of-no-return.  This avoids side effects on failed executions while
> ensuring that the resulting task cannot gain more privileges through later exec
> transitions. This is done through the set_nnp_on_point_of_no_return field.
> 
> This has a little subtlety: LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS in the BPF
> path will not stop the current execution from escalating at all; only subsequent
> ones.

This makes sense too, but it needs to be documented.
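
To make the documented behavior concrete, here is a toy userspace model
of the staging described above.  It is only a sketch of the semantics as
I read them from the cover letter, not kernel code: struct toy_task,
struct toy_bprm, toy_stage_nnp() and toy_exec() are all hypothetical
names, and only set_nnp_on_point_of_no_return is taken from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are kernel types. */
struct toy_task {
	bool no_new_privs;
	unsigned int euid;
};

struct toy_bprm {
	bool set_nnp_on_point_of_no_return; /* field name from the series */
	bool setuid_root; /* the target binary is setuid root */
	bool fails_early; /* exec aborts before the point of no return */
};

/* Models the BPF path in a bprm hook: stage the request, commit nothing. */
static void toy_stage_nnp(struct toy_bprm *bprm)
{
	bprm->set_nnp_on_point_of_no_return = true;
}

/* Returns 0 on success, -1 if exec failed before the point of no return. */
static int toy_exec(struct toy_task *task, const struct toy_bprm *bprm)
{
	/*
	 * The privilege decision for this exec uses the no_new_privs
	 * state the task had before the exec started.
	 */
	bool may_escalate = bprm->setuid_root && !task->no_new_privs;

	if (bprm->fails_early)
		return -1; /* nothing was committed, the task is untouched */

	/* Point of no return: commit the staged state. */
	if (may_escalate)
		task->euid = 0;
	if (bprm->set_nnp_on_point_of_no_return)
		task->no_new_privs = true;

	assert(!bprm->set_nnp_on_point_of_no_return || task->no_new_privs);
	return 0;
}
```

In this model, a failed exec leaves no_new_privs unset, a successful
exec of a setuid binary with the staged flag still escalates, and any
later exec by the same task cannot.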

> This is intentional to allow landlock policies to be applied through a

s/landlock/Landlock/g in every text/comment/commit description please.

> setuid transition for instance, without affecting the current escalation.
> 
> Semantics
> ===
> 
> This proposal is intended to preserve Landlock semantics as much as practical
> for an exec-time BPF attachment model:
> 
> 1. only pre-existing Landlock rulesets may be applied;
> 2. BPF cannot construct, inspect, or modify rulesets;

Inspection will be possible with tracepoints, but it is orthogonal to
this series.

> 3. enforcement still happens before the new program begins execution;
> 4. normal Landlock inheritance, layering, and future composition remain
> 	unchanged; and
> 5. this does not bypass Landlock's privilege checks for applying Landlock
>     rulesets.
> 
> In other words, BPF acts as an external selector for when to apply Landlock,
> not as a replacement for Landlock's enforcement engine.
> 
> All behavior, future access rights, and previous access rights are designed
> to automatically be supported from either BPF or existing syscall contexts.
> 
> The main semantic difference is LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS on the BPF
> path: it guarantees that the resulting task is pinned with no_new_privs before
> it can perform later exec transitions, but it does not retroactively suppress
> privilege gain for the current exec transition itself.
> 
> The other exception to semantics is the LANDLOCK_RESTRICT_SELF_TSYNC flag.
> (see Points of Feedback section)
> 
> Patch layout
> ===
> 
> Patches 1-5 prepare the Landlock side by moving shared ruleset logic out of
> syscalls.c, adding a no_new_privs flag for non-syscall callers, exposing
> linux_binprm->set_nnp_on_point_of_no_return as an interface to set no_new_privs
> on the point of no return, and making deferred ruleset destruction RCU-safe.
> 
> Patches 6-10 add the BPF-facing pieces: the Landlock kfuncs, the new map type,
> syscall handling for that map, and verifier support.
> 
> Patches 11-15 add selftests and the small bpftool update needed for the new
> map type.
> 
> Patches 16-20 add docs and bump the ABI version and update MAINTAINERS.
> 
> Feedback is especially welcome on the overall interface shape, the choice of
> hooks, and the map semantics.

I'll review each patch separately, but this approach is promising.

I think it would be simpler to have a dedicated patch series for
LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS, and then send another series
specific to the eBPF side (kfunc, tests, doc...).  I'm not sure what the
best way to deal with dependencies across Landlock and BPF is, though.
What is the policy for BPF next wrt other next branches?

> 
> Testing
> ===
> 
> This patch series has two portions of tests.
> 
> One lives in the traditional Landlock selftests, for the new
> LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS flag.
> 
> The other suite lives under the BPF selftests, and this tests the Landlock
> kfuncs and the new BPF_MAP_TYPE_LANDLOCK_RULESET.
> 
> This patch series was run through BPF CI, the results of which are here. [1]
> 
> All mentioned tests are passing, as well as the BPF CI.
> 
> [1] : https://github.com/kernel-patches/bpf/pull/11562
> 
> Points of Feedback
> ===
> 
> First, the new set_nnp_on_point_of_no_return field in struct linux_binprm.
> This field was needed to request that task_set_no_new_privs be set during an
> execution, but only after the execution has proceeded beyond the point of no
> return. I couldn't find a way to express this semantic without adding a new
> bitfield to struct linux_binprm and a conditional in fs/exec.c. Please see
> patch 2.

What about using security_bprm_committing_creds()?

> 
> Feedback on the BPF testing harness, which was generated with AI assistance as
> disclosed in the commit footer, is welcomed. I have only limited familiarity
> with BPF testing practices. These tests were made with strong human supervision.
> See patches 14 and 15.
> 
> Feedback on the NO_NEW_PRIVS situation is also welcomed. Because task_set_no_new_privs()
> would otherwise leak state on failed executions or AT_EXECVE_CHECK, this series
> stages no_new_privs through the exec context and only commits it after
> point-of-no-return. This preserves failure behavior while still ensuring that
> the resulting task cannot elevate further through later exec transitions.
> When called from bprm_creds_from_file, this does not retroactively change the
> privilege outcome of the current exec transition itself.
> 
> See patch 2 and 3.
> 
> Next, the RCU in the landlock_ruleset. Existing BPF maps use RCU to make sure maps
> holding references stay valid. I altered the landlock ruleset to use rcu_work
> to make sure that the rcu is synchronized before putting on a ruleset, and
> acquire the rcu in the arraymap implementation. See patches 5-10.
> 
> Next, the semantics of the map. What operations should be supported from BPF
> and userspace and what data types should they return? I consider the struct
> bpf_landlock_ruleset to be opaque. Userspace can add items to the map via the
> fd, delete items by their index, and BPF can delete and lookup items by their
> index. Items cannot be updated, only swapped.
> 
> Finally, the handling of the LANDLOCK_RESTRICT_SELF_TSYNC flag. This flag has
> no meaning in a pre-execution context, as the credentials during the designated
> LSM hooks (bprm_creds_for_exec/creds_from_file) still represent the pre-execution
> task. Therefore, this flag is invalidated and attempting to use it with
> bpf_landlock_restrict_binprm will return -EINVAL. Otherwise, the flag would
> result in applying the landlock ruleset to the wrong target in addition to the
> intended one. (see patch 2). This behavior is validated with selftests.
> 
> Existing works / Credits
> ===
> 
> Mickaël Salaün created patchsets adding BPF tracepoints for landlock in [2] [3].
> 
> Mickaël also gave feedback on this feature and the idea in this GitHub thread. [4]
> 
> Günther Noack initially received and provided initial feedback on this idea as
> an early prototype.
> 
> Liz Rice, author of "Learning eBPF: Programming the Linux Kernel for Enhanced
> Observability, Networking, and Security" provided background and inspired me to
> experiment with BPF and the BPF LSM. [5]
> 
> [2] : https://lore.kernel.org/all/20250523165741.693976-1-mic@digikod.net/
> [3] : https://lore.kernel.org/linux-security-module/20260406143717.1815792-1-mic@digikod.net/
> [4] : https://github.com/landlock-lsm/linux/issues/56
> [5] : https://wellesleybooks.com/book/9781098135126
> 
> Kind Regards,
> Justin Suess
> 
> Justin Suess (20):
>   landlock: Move operations from syscall into ruleset code
>   execve: Add set_nnp_on_point_of_no_return
>   landlock: Implement LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
>   selftests/landlock: Cover LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
>   landlock: Make ruleset deferred free RCU safe
>   bpf: lsm: Add Landlock kfuncs
>   bpf: arraymap: Implement Landlock ruleset map
>   bpf: Add Landlock ruleset map type
>   bpf: syscall: Handle Landlock ruleset maps
>   bpf: verifier: Add Landlock ruleset map support
>   selftests/bpf: Add Landlock kfunc declarations
>   selftests/landlock: Rename gettid wrapper for BPF reuse
>   selftests/bpf: Enable Landlock in selftests kernel.
>   selftests/bpf: Add Landlock kfunc test program
>   selftests/bpf: Add Landlock kfunc test runner
>   landlock: Bump ABI version
>   tools: bpftool: Add documentation for landlock_ruleset
>   landlock: Document LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
>   bpf: Document BPF_MAP_TYPE_LANDLOCK_RULESET
>   MAINTAINERS: update entry for the Landlock subsystem
> 
>  Documentation/bpf/map_landlock_ruleset.rst    | 181 +++++
>  Documentation/userspace-api/landlock.rst      |  22 +-
>  MAINTAINERS                                   |   4 +
>  fs/exec.c                                     |   8 +
>  include/linux/binfmts.h                       |   7 +-
>  include/linux/bpf_lsm.h                       |  15 +
>  include/linux/bpf_types.h                     |   1 +
>  include/linux/landlock.h                      |  92 +++
>  include/uapi/linux/bpf.h                      |   1 +
>  include/uapi/linux/landlock.h                 |  14 +
>  kernel/bpf/arraymap.c                         |  67 ++
>  kernel/bpf/bpf_lsm.c                          | 145 ++++
>  kernel/bpf/syscall.c                          |   4 +-
>  kernel/bpf/verifier.c                         |  15 +-
>  samples/landlock/sandboxer.c                  |   7 +-
>  security/landlock/limits.h                    |   2 +-
>  security/landlock/ruleset.c                   | 198 ++++-
>  security/landlock/ruleset.h                   |  25 +-
>  security/landlock/syscalls.c                  | 158 +---
>  .../bpf/bpftool/Documentation/bpftool-map.rst |   2 +-
>  tools/bpf/bpftool/map.c                       |   2 +-
>  tools/include/uapi/linux/bpf.h                |   1 +
>  tools/lib/bpf/libbpf.c                        |   1 +
>  tools/lib/bpf/libbpf_probes.c                 |   6 +
>  tools/testing/selftests/bpf/bpf_kfuncs.h      |  20 +
>  tools/testing/selftests/bpf/config            |   5 +
>  tools/testing/selftests/bpf/config.x86_64     |   1 -
>  .../bpf/prog_tests/landlock_kfuncs.c          | 733 ++++++++++++++++++
>  .../selftests/bpf/progs/landlock_kfuncs.c     |  92 +++
>  tools/testing/selftests/landlock/base_test.c  |  10 +-
>  tools/testing/selftests/landlock/common.h     |  28 +-
>  tools/testing/selftests/landlock/fs_test.c    | 103 +--
>  tools/testing/selftests/landlock/net_test.c   |  55 +-
>  .../testing/selftests/landlock/ptrace_test.c  |  14 +-
>  .../landlock/scoped_abstract_unix_test.c      |  51 +-
>  .../selftests/landlock/scoped_base_variants.h |  23 +
>  .../selftests/landlock/scoped_common.h        |   5 +-
>  .../selftests/landlock/scoped_signal_test.c   |  30 +-
>  tools/testing/selftests/landlock/wrappers.h   |   2 +-
>  39 files changed, 1877 insertions(+), 273 deletions(-)
>  create mode 100644 Documentation/bpf/map_landlock_ruleset.rst
>  create mode 100644 include/linux/landlock.h
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/landlock_kfuncs.c
>  create mode 100644 tools/testing/selftests/bpf/progs/landlock_kfuncs.c
> 
> 
> base-commit: 8c6a27e02bc55ab110d1828610048b19f903aaec
> -- 
> 2.53.0
> 
>
