* Re: [PATCH 05/11] hornet: gen_sig: fix off-by-one check for used maps
From: Blaise Boscaccy @ 2026-05-29 18:03 UTC (permalink / raw)
To: Paul Moore
Cc: Jonathan Corbet, Shuah Khan, James Morris, Serge E. Hallyn,
Eric Biggers, Fan Wu, James.Bottomley, linux-security-module
In-Reply-To: <CAHC9VhR6G5qmd3kXPas_L_SiJx=6J=wUw80xxL9Eu4=tSjMAoQ@mail.gmail.com>
Paul Moore <paul@paul-moore.com> writes:
> On Wed, May 27, 2026 at 11:09 PM Blaise Boscaccy
> <bboscaccy@linux.microsoft.com> wrote:
>>
>> A logic bug limited the maximum number of used maps to
>> MAX_USED_MAPS-1.
>
> Should this be MAX_HASHES-1 and not MAX_USED_MAPS-1?
>
Good eye. Yes that should be MAX_HASHES-1 in the commit message.
>> Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
>> ---
>> scripts/hornet/gen_sig.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/scripts/hornet/gen_sig.c b/scripts/hornet/gen_sig.c
>> index b4f983ab24bcd..4e8caad22f381 100644
>> --- a/scripts/hornet/gen_sig.c
>> +++ b/scripts/hornet/gen_sig.c
>> @@ -317,11 +317,11 @@ int main(int argc, char **argv)
>> data_path = optarg;
>> break;
>> case 'A':
>> - hashes[hash_count].file = optarg;
>> - if (++hash_count >= MAX_HASHES) {
>> + if (hash_count >= MAX_HASHES) {
>> usage(argv[0]);
>> return EXIT_FAILURE;
>> }
>> + hashes[hash_count++].file = optarg;
>> break;
>> default:
>> usage(argv[0]);
>> --
>> 2.53.0
>
> --
> paul-moore.com
^ permalink raw reply
* Re: [PATCH bpf v3 2/2] bpf, libbpf: reject non-exclusive metadata maps in the signed loader
From: Alexei Starovoitov @ 2026-05-29 15:01 UTC (permalink / raw)
To: Daniel Borkmann
Cc: KP Singh, bpf, LSM List, Alexei Starovoitov,
Kumar Kartikeya Dwivedi
In-Reply-To: <544dbc0d-24d2-423f-9db4-07976d67a9d0@iogearbox.net>
On Fri, May 29, 2026 at 5:25 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 5/23/26 5:12 PM, Alexei Starovoitov wrote:
> > On Fri, May 22, 2026 at 11:53 PM KP Singh <kpsingh@kernel.org> wrote:
> >>
> >> The loader verifies map->sha against the metadata hash in its
> >> instructions. map->sha is calculated when BPF_OBJ_GET_INFO_BY_FD is called
> >> on the frozen map.
> >>
> >> While the map is frozen, the loader must also ensure the map is
> >> exclusive, as, without exclusivity, another BPF program with map access
> >> can mutate the contents afterwards, so the check passes on stale data.
> >
> > Hold on. How is this an issue? excl_prog_sha guarantees
> > that only loader prog can use this map.
> > Are you saying the same loader prog will use the same map
> > for the 2nd time. Ok. I still don't see a problem.
> >
> >> Place excl_prog_sha right after sha[] in struct bpf_map and have
> >> gen_loader bail with -EINVAL when it is NULL, via BPF_PSEUDO_MAP_IDX at
> >> fixed offset 32. The 8-byte read of the pointer field limits this to
> >> 64-bit kernels; gen_loader needs target pointer size tracking to emit
> >> the right sized read on 32-bit (follow-up).
> >
> > I don't think we can go from maybe-racy to certainly-broken-on-32-bit.
> > So only applied patch 1.
>
> I've looked a bit more into it with regards to above question from Alexei
> as well as the __bpf_md_ptr issue.
>
> Imho, KP is correct that the extra check/enforcement is needed. So Alice
> as a trusted signer generates the loader program (loader_insns + data_blob)
> and signs it. The loader program contains the below enforcement to reject
> if the metadata map was not exclusive.
>
> Now the (untrusted) host that wants to load the program, it holds a signed
> loader where they can't change a byte of it without breaking the signature.
>
> However, it could simply omit excl_prog_hash on BPF_MAP_CREATE for the data
> map (which would "normally" be bound exclusively to the loader).
>
> Then check_map_prog_compatibility() enforcement is skipped on verifier side
> given excl_prog_sha is not set. The loader loads fine, the fingerprint check
> can then pass against a stale snapshot while a different program mangled the
> data_blob underneath.
>
> Regarding __bpf_md_ptr, I would solve it differently via fixed size, see below
> together with the excl check coming before the signature check in the loader
> and the build bug assertions, and a jmp not eq to 1.
>
> include/linux/bpf.h | 1 +
> kernel/bpf/syscall.c | 5 +++++
> tools/lib/bpf/gen_loader.c | 17 +++++++++++++++++
> 3 files changed, 23 insertions(+)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index cd191c5fdb0a..487f4653d8a6 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -295,6 +295,7 @@ struct bpf_map_owner {
>
> struct bpf_map {
> u8 sha[SHA256_DIGEST_SIZE];
> + u32 excl;
> const struct bpf_map_ops *ops;
> struct bpf_map *inner_map_meta;
> #ifdef CONFIG_SECURITY
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 630d530782fe..37dacdbc5c01 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1572,6 +1572,11 @@ static int map_create(union bpf_attr *attr, bpfptr_t uattr)
> err = -EFAULT;
> goto free_map;
> }
> +
> + /* See libbpf: emit_signature_match() */
> + BUILD_BUG_ON(offsetof(struct bpf_map, excl) != SHA256_DIGEST_SIZE);
> + BUILD_BUG_ON(offsetof(struct bpf_map, sha) != 0);
> + map->excl = 1;
> } else if (attr->excl_prog_hash_size) {
> err = -EINVAL;
> goto free_map;
> diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
> index bcea21c3b7bb..cd8d7df94ac7 100644
> --- a/tools/lib/bpf/gen_loader.c
> +++ b/tools/lib/bpf/gen_loader.c
> @@ -586,6 +586,23 @@ static void emit_signature_match(struct bpf_gen *gen)
> __s64 off;
> int i;
>
> + /*
> + * Reject if the metadata map is not exclusive. Without exclusivity
> + * the cached map->sha[] verified above can be stale: another BPF
> + * program with map access could have mutated the contents between
> + * BPF_OBJ_GET_INFO_BY_FD and loader execution.
> + */
> + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX,
> + 0, 0, 0, 0));
> + emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, SHA256_DIGEST_LENGTH));
> + off = -(gen->insn_cur - gen->insn_start - gen->cleanup_label) / 8 - 2;
> + if (is_simm16(off)) {
> + emit(gen, BPF_MOV64_IMM(BPF_REG_7, -EINVAL));
> + emit(gen, BPF_JMP_IMM(BPF_JNE, BPF_REG_2, 1, off));
> + } else {
> + gen->error = -ERANGE;
> + }
yeah. much cleaner. ship it.
^ permalink raw reply
* Re: [PATCH v5 12/13] ima: Return error on deleting measurements already copied during kexec
From: Roberto Sassu @ 2026-05-29 14:59 UTC (permalink / raw)
To: Mimi Zohar, corbet, skhan, dmitry.kasatkin, eric.snowberg, paul,
jmorris, serge
Cc: linux-doc, linux-kernel, linux-integrity, linux-security-module,
gregorylumen, chenste, nramas, Roberto Sassu
In-Reply-To: <ea886419ef3047ede1885504fad8f865cdcc5ce3.camel@linux.ibm.com>
On Tue, 2026-05-26 at 10:02 -0400, Mimi Zohar wrote:
> On Wed, 2026-04-29 at 18:03 +0200, Roberto Sassu wrote:
> > From: Roberto Sassu <roberto.sassu@huawei.com>
> >
> > Refuse to delete staged or active list measurements, if a kexec racing with
> > the deletion already copied those measurements in the kexec buffer. In this
> > way, user space becomes aware that those measurements are going to appear
> > in the secondary kernel, and thus they don't have to be saved twice.
>
> There are two reboot notifiers: one to prevent additional measurements extending
> the TPM, while the other copies the measurements for kexec. This patch prevents
> deleting the staged measurements after the latter notifier.
>
> Instead of introducing a specific method for detecting whether the measurement
> list has been copied, rely on one of the two existing reboot notifiers. The
> simplest method would test "ima_measurements_suspended", which would prevent
> deleting the staged measurements a bit earlier.
Testing that the reboot notifier fired (with the
ima_measurements_suspended variable) is not enough to know whether the
measurements dump took place or not.
We need a flag (one is enough) protected by ima_extend_list_mutex, so
that we know reliably which event occurred first, or the dump or the
staging/delete (which are also protected by ima_extend_list_mutex).
Roberto
^ permalink raw reply
* Re: [PATCH] tpm-buf: memory-safe allocations
From: James Bottomley @ 2026-05-29 14:08 UTC (permalink / raw)
To: Jarkko Sakkinen
Cc: linux-integrity, Jarkko Sakkinen, Arun Menon, Daniel P. Smith,
Alec Brown, Ross Philipson, Stefan Berger, Peter Huewe,
Jason Gunthorpe, Mimi Zohar, David Howells, Paul Moore,
James Morris, Serge E. Hallyn, linux-kernel, keyrings,
linux-security-module
In-Reply-To: <ahVRefyT4BTKOu0m@kernel.org>
On Tue, 2026-05-26 at 10:53 +0300, Jarkko Sakkinen wrote:
> On Mon, May 25, 2026 at 01:50:51PM -0400, James Bottomley wrote:
> > On Fri, 2026-05-22 at 04:35 +0300, Jarkko Sakkinen wrote:
> > > Decouple kzalloc from buffer creation, so that a managed
> > > allocation
> > > can be
> > > used:
> > >
> > > struct tpm_buf *buf __free(kfree) buf =
> > > kzalloc(TPM_BUFSIZE,
> > > GFP_KERNEL);
> > > if (!buf)
> > > return -ENOMEM;
> > >
> > > tpm_buf_init(buf, TPM_BUFSIZE);
> > >
> > > Alternatively, stack allocations are also possible:
> > >
> > > u8 buf_data[512];
> > > struct tpm_buf *buf = (struct tpm_buf *)buf_data;
> > > tpm_buf_init(buf, sizeof(buf_data));
> >
> > This isn't really a good idea from a security point of view.
> > Remember the buffer has to be big enough for both the sent and the
> > received data. Today we simply set TPM_BUFSIZE to the maximum
> > amount a TPM requires and all the send and receives just work. If
> > we let callers set this size, we're asking for them to get it wrong
> > (or at least forget about the receive part) and for us to get a DMA
> > overrun from the TPM ... which might be potentially exploitable
> > depending on how it occurs (think of an unseal of user chosen data
> > overrunning).
>
> It's one patch so you're free to remark the call sites where this
> happens. This is not a majorn concern at all.
Nearly twenty years ago, when the kernel was a lot smaller, a then
kernel luminary called Rusty Russell realized we needed to pay much
more attention to how we design APIs inside the kernel if we wanted it
to grow successfully. He published his initial thoughts and gave talks
at both the kernel summit and OLS on it:
https://ozlabs.org/~rusty/index.cgi/tech/2008-03-18.html
The key point that's always stuck with me is "hard to misuse beats easy
to use". Later he came up with a rating scale (now known as the Rusty
API classification):
https://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html
and for chuckles and grins on April fools day he came up with a
negative rating ridiculing some of our dafter API choices:
https://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html
The point for this patch set is that the sizing of the original tpm_buf
interface scores 10/10 on the Rusty scale (it's impossible to get
wrong). Simply threading size through the whole API, as this patch
does, may look like the right answer, but it causes a massive reduction
in API score. In fact, since the buffer has to be sized not only
according to what goes in, but also what gets returned and this is
nowhere mentioned in the new documentation it scores -3 (read the
documentation and you can still get it wrong). Now by mentioning the
sizing problems in the doc, you can probably get it up to +3 (read the
documentation and you'll get it right) but my question was not if you
got it wrong somewhere in the patch but whether we couldn't do a whole
lot better in terms of API score by designing a better API.
A key point about the 185 version of the TPM spec is that it's really
only a few commands that need larger buffers (the Post Quantum ML-KEM
keys) which doesn't apply to most of the in-kernel TPM callsites.
Since tpm_buf_init takes the ordinal, we can actually tell at runtime
(or compile time if the ordinal is a constant) if the command would
need a larger buffer. We can also tell from the TPM properties whether
the TPM itself can take a larger buffer, so for every current TPM we
could retain the original score 10/10 API and warn at runtime if there
might be a problem. Then the larger keys seem to fit into 8k, so we
could still retain most of the original API properties of being
difficult to misuse simply by having an 8k size flag (which we could
ignore if the TPM doesn't support it) and warn at runtime if
tpm_buf_init sends an ordinal which might need a larger buffer. At
worst we should be able to get to an API which scores 5/10 (do it right
or it will break at runtime).
Regards,
James
^ permalink raw reply
* Re: [PATCH bpf v3 2/2] bpf, libbpf: reject non-exclusive metadata maps in the signed loader
From: Daniel Borkmann @ 2026-05-29 12:25 UTC (permalink / raw)
To: Alexei Starovoitov, KP Singh
Cc: bpf, LSM List, Alexei Starovoitov, Kumar Kartikeya Dwivedi
In-Reply-To: <CAADnVQLJsvCfRxyLT-NJRubwSPTNd0k5bEp45Zyu9q1B_3oG+A@mail.gmail.com>
On 5/23/26 5:12 PM, Alexei Starovoitov wrote:
> On Fri, May 22, 2026 at 11:53 PM KP Singh <kpsingh@kernel.org> wrote:
>>
>> The loader verifies map->sha against the metadata hash in its
>> instructions. map->sha is calculated when BPF_OBJ_GET_INFO_BY_FD is called
>> on the frozen map.
>>
>> While the map is frozen, the loader must also ensure the map is
>> exclusive, as, without exclusivity, another BPF program with map access
>> can mutate the contents afterwards, so the check passes on stale data.
>
> Hold on. How is this an issue? excl_prog_sha guarantees
> that only loader prog can use this map.
> Are you saying the same loader prog will use the same map
> for the 2nd time. Ok. I still don't see a problem.
>
>> Place excl_prog_sha right after sha[] in struct bpf_map and have
>> gen_loader bail with -EINVAL when it is NULL, via BPF_PSEUDO_MAP_IDX at
>> fixed offset 32. The 8-byte read of the pointer field limits this to
>> 64-bit kernels; gen_loader needs target pointer size tracking to emit
>> the right sized read on 32-bit (follow-up).
>
> I don't think we can go from maybe-racy to certainly-broken-on-32-bit.
> So only applied patch 1.
I've looked a bit more into it with regards to above question from Alexei
as well as the __bpf_md_ptr issue.
Imho, KP is correct that the extra check/enforcement is needed. So Alice
as a trusted signer generates the loader program (loader_insns + data_blob)
and signs it. The loader program contains the below enforcement to reject
if the metadata map was not exclusive.
Now the (untrusted) host that wants to load the program, it holds a signed
loader where they can't change a byte of it without breaking the signature.
However, it could simply omit excl_prog_hash on BPF_MAP_CREATE for the data
map (which would "normally" be bound exclusively to the loader).
Then check_map_prog_compatibility() enforcement is skipped on verifier side
given excl_prog_sha is not set. The loader loads fine, the fingerprint check
can then pass against a stale snapshot while a different program mangled the
data_blob underneath.
Regarding __bpf_md_ptr, I would solve it differently via fixed size, see below
together with the excl check coming before the signature check in the loader
and the build bug assertions, and a jmp not eq to 1.
include/linux/bpf.h | 1 +
kernel/bpf/syscall.c | 5 +++++
tools/lib/bpf/gen_loader.c | 17 +++++++++++++++++
3 files changed, 23 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index cd191c5fdb0a..487f4653d8a6 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -295,6 +295,7 @@ struct bpf_map_owner {
struct bpf_map {
u8 sha[SHA256_DIGEST_SIZE];
+ u32 excl;
const struct bpf_map_ops *ops;
struct bpf_map *inner_map_meta;
#ifdef CONFIG_SECURITY
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 630d530782fe..37dacdbc5c01 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1572,6 +1572,11 @@ static int map_create(union bpf_attr *attr, bpfptr_t uattr)
err = -EFAULT;
goto free_map;
}
+
+ /* See libbpf: emit_signature_match() */
+ BUILD_BUG_ON(offsetof(struct bpf_map, excl) != SHA256_DIGEST_SIZE);
+ BUILD_BUG_ON(offsetof(struct bpf_map, sha) != 0);
+ map->excl = 1;
} else if (attr->excl_prog_hash_size) {
err = -EINVAL;
goto free_map;
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
index bcea21c3b7bb..cd8d7df94ac7 100644
--- a/tools/lib/bpf/gen_loader.c
+++ b/tools/lib/bpf/gen_loader.c
@@ -586,6 +586,23 @@ static void emit_signature_match(struct bpf_gen *gen)
__s64 off;
int i;
+ /*
+ * Reject if the metadata map is not exclusive. Without exclusivity
+ * the cached map->sha[] verified above can be stale: another BPF
+ * program with map access could have mutated the contents between
+ * BPF_OBJ_GET_INFO_BY_FD and loader execution.
+ */
+ emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX,
+ 0, 0, 0, 0));
+ emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, SHA256_DIGEST_LENGTH));
+ off = -(gen->insn_cur - gen->insn_start - gen->cleanup_label) / 8 - 2;
+ if (is_simm16(off)) {
+ emit(gen, BPF_MOV64_IMM(BPF_REG_7, -EINVAL));
+ emit(gen, BPF_JMP_IMM(BPF_JNE, BPF_REG_2, 1, off));
+ } else {
+ gen->error = -ERANGE;
+ }
+
for (i = 0; i < SHA256_DWORD_SIZE; i++) {
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX,
0, 0, 0, 0));
--
2.43.0
^ permalink raw reply related
* Re: [PATCH v4 1/2] rust: task: clarify comments on task UID accessors
From: Gary Guo @ 2026-05-29 12:17 UTC (permalink / raw)
To: Alice Ryhl, Paul Moore, Serge Hallyn, Jonathan Corbet,
Greg Kroah-Hartman, Shuah Khan, Alex Shi, Yanteng Si,
Dongliang Mu
Cc: Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Danilo Krummrich,
Jann Horn, linux-security-module, linux-doc, linux-kernel,
rust-for-linux
In-Reply-To: <20260529-remove-task-euid-v4-1-07cbdf3af980@google.com>
On Fri May 29, 2026 at 10:33 AM BST, Alice Ryhl wrote:
> From: Jann Horn <jannh@google.com>
>
> Linux has separate subjective and objective task credentials, see the
> comment above `struct cred`. Clarify which accessor functions operate on
> which set of credentials.
>
> Also document that Task::euid() is a very weird operation. You can see how
> weird it is by grepping for task_euid() - binder is its only user.
> Task::euid() obtains the objective effective UID - it looks at the
> credentials of the task for purposes of acting on it as an object, but then
> accesses the effective UID (which the credentials.7 man page describes as
> "[...] used by the kernel to determine the permissions that the process
> will have when accessing shared resources [...]").
>
> For context:
> Arguably, binder's use of task_euid() is a theoretical security problem,
> which only has no impact on Android because Android has no setuid binaries
> executable by apps.
> commit 29bc22ac5e5b ("binder: use euid from cred instead of using task")
> fixed that by removing that only user of task_euid(), but the fix got
> reverted in commit c21a80ca0684 ("binder: fix test regression due to
> sender_euid change") because some Android test started failing.
>
> Signed-off-by: Jann Horn <jannh@google.com>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
> ---
> Originally sent as:
> https://lore.kernel.org/r/20260212-rust-uid-v1-1-deff4214c766@google.com
> ---
> rust/kernel/task.rs | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
^ permalink raw reply
* Re: [REPORT] landlock: SCOPE_SIGNAL bypass via F_SETOWN to invoker pgid -> SIGIO/SIGKILL to non-sandboxed targets
From: Mickaël Salaün @ 2026-05-29 11:08 UTC (permalink / raw)
To: hexlabsecurity
Cc: Justin Suess, gnoack@google.com,
linux-security-module@vger.kernel.org, stable@vger.kernel.org
In-Reply-To: <TSwHGN3I-u6p6xv7CqnvDOhR3la_kQWq0rdjBdA0gt30AsYLwddoxjCCFmqXcQMxWHS4ShULEp7sO_8HdFRGPLk30rIQHy3EurwJyrjP3NQ=@proton.me>
Hi,
Thanks for the report. Could you please replace the reproducer code
with a proper kselftest?
That would need to be a new email patch (v3) as explained here:
https://docs.kernel.org/process/submitting-patches.html
Regards,
Mickaël
On Fri, May 29, 2026 at 04:43:02AM +0000, hexlabsecurity@proton.me wrote:
> Thanks Justin -- much appreciated for reproducing on mic/next and for the
> Tested-by.
>
> v2 below addresses your review:
> - the commit message is trimmed to just the bug and the fix;
> - the reproducer and the A/B verification are moved below the --- so
> they become git notes, not part of the commit;
> - added your Tested-by.
>
> The fix hunk is unchanged. I agree the concise statement of the defect is
> "we fail to check the subject on fan-out signal types (PIDTYPE_PGID and
> PIDTYPE_SID, i.e. type > PIDTYPE_TGID)". The patch keeps the explicit
> PIDTYPE_PGID / PIDTYPE_SID test for readability and to stay robust if the
> enum is ever reordered -- happy to switch to "> PIDTYPE_TGID" if you
> prefer. I'll follow up separately on the erratum entry and a regression
> test, as you suggested.
>
> Independent security researcher. HEXLAB SAS (registration pending) --
> Cali, Colombia.
>
> Thanks,
> Bryam Vargas
>
> ----- v2 patch (inline, plain text) -----
>
> From 75f801309cd64f74d04ef86236bd973314dd7d94 Mon Sep 17 00:00:00 2001
> From: Bryam Vargas <hexlabsecurity@proton.me>
> Date: Thu, 28 May 2026 23:33:13 -0500
> Subject: [PATCH v2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via F_SETOWN to
> invoker's pgid
>
> A Landlock-restricted process can bypass LANDLOCK_SCOPE_SIGNAL on the
> SIGIO delivery path and deliver arbitrary signals (including SIGKILL via
> F_SETSIG) to non-Landlocked targets that share its pgid, by exploiting a
> producer-side cache-vs-live evaluation gap.
>
> The SIGIO path in hook_file_send_sigiotask() consults a cached subject
> stored in landlock_file(file)->fown_subject at fcntl(F_SETOWN) time
> (via hook_file_set_fowner()), instead of evaluating the live Landlock
> domain of the invoking task at signal-send time. The capture is gated
> by control_current_fowner(), which returns false (skipping capture)
> when pid_task(fown->pid, fown->pid_type) is in current's thread group.
>
> This is correct for PIDTYPE_TGID / PIDTYPE_PID, where the target is a
> single task sharing current's cred. It is unsafe for PIDTYPE_PGID and
> PIDTYPE_SID: when current is at the head of its pgid hlist -- the
> default placement after fork(), hlist_add_head_rcu() in kernel/fork.c --
> pid_task(pgid, PIDTYPE_PGID) resolves to current itself,
> same_thread_group(current, current) is true, the capture is skipped, and
> fown_subject.domain stays NULL. hook_file_send_sigiotask() then
> short-circuits at "if (!subject->domain) return 0;", letting the kernel
> fan the signal out to every member of the group, including tasks outside
> current's Landlock domain that SCOPE_SIGNAL is supposed to protect.
>
> The direct kill() path (hook_task_kill) is unaffected: it evaluates
> current's live domain on every call. Only the cached SIGIO path is
> broken.
>
> Tighten control_current_fowner() to apply the thread-group exemption
> only when the target identifies a single task whose Landlock cred is
> necessarily shared with current (PIDTYPE_TGID, PIDTYPE_PID). For
> PIDTYPE_PGID and PIDTYPE_SID, always capture the current Landlock
> subject so the consumer's scope check runs against every member of the
> group at delivery time.
>
> Reported-by: Bryam Vargas <hexlabsecurity@proton.me>
> Tested-by: Justin Suess <utilityemal77@gmail.com>
> Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
> ---
> v2: per review, the commit message is trimmed to the bug + the fix; the
> reproducer and the A/B verification are moved below the --- so they
> stay out of the commit. Added Tested-by. The hunk is unchanged from
> v1 (v1 sent to security@kernel.org 2026-05-28, embargoed -- not yet
> in a public archive).
>
> Reproducer (ordinary unprivileged user; sandbox active in the child):
>
> int pfd[2]; pipe(pfd);
> landlock_create_ruleset(&{.scoped = LANDLOCK_SCOPE_SIGNAL},
> sizeof(attr), 0);
> prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
> landlock_restrict_self(rfd, 0);
> fcntl(pfd[0], F_SETSIG, SIGKILL);
> fcntl(pfd[0], F_SETOWN, -getpgrp()); /* PIDTYPE_PGID */
> fcntl(pfd[0], F_SETFL, O_ASYNC);
> write(pfd[1], "X", 1); /* trigger SIGIO */
> /* every pgid member receives SIGKILL, including the non-sandboxed
> * parent / supervisor / sibling workers */
>
> A/B-verified on a 6.12.90 lab kernel (same .config, only this hunk
> differs): pre-fix the sandboxed child's SIGKILL reaches the
> non-sandboxed parent (SCOPE_SIGNAL bypassed); post-fix it is blocked.
> hook_task_kill's direct-kill enforcement and the intra-thread-group
> F_SETOWN cases continue to work post-patch.
>
> security/landlock/fs.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index c1ecfe239032..edaa52572cbd 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -1909,6 +1909,18 @@ static bool control_current_fowner(struct fown_struct *const fown)
> if (!p)
> return true;
>
> + /*
> + * For PIDTYPE_PGID and PIDTYPE_SID, signal delivery fans out to
> + * every member of the group at SIGIO time. Even when pid_task()
> + * resolves to current itself (e.g., current is the pgid hlist
> + * head post-fork), non-current members of the group are still
> + * valid targets that must be checked by hook_file_send_sigiotask().
> + * Always capture the current subject for those types so the
> + * consumer scope check runs against the live fown_subject.
> + */
> + if (fown->pid_type == PIDTYPE_PGID || fown->pid_type == PIDTYPE_SID)
> + return true;
> +
> return !same_thread_group(p, current);
> }
> --
> 2.43.0
^ permalink raw reply
* [PATCH v4 2/2] cred: delete task_euid()
From: Alice Ryhl @ 2026-05-29 9:33 UTC (permalink / raw)
To: Paul Moore, Serge Hallyn, Jonathan Corbet, Greg Kroah-Hartman,
Shuah Khan, Alex Shi, Yanteng Si, Dongliang Mu
Cc: Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Danilo Krummrich,
Jann Horn, linux-security-module, linux-doc, linux-kernel,
rust-for-linux, Alice Ryhl
In-Reply-To: <20260529-remove-task-euid-v4-0-07cbdf3af980@google.com>
task_euid() is a very weird operation. You can see how weird it is by
grepping for task_euid() - binder is its only user. task_euid() obtains
the objective effective UID - it looks at the credentials of the task
for purposes of acting on it as an object, but then accesses the
effective UID (which the credentials.7 man page describes as "[...] used
by the kernel to determine the permissions that the process will have
when accessing shared resources [...]").
Since usage in Binder has now been removed, get rid of the resulting
dead code.
Changes to the zh_CN translation was carried out with the help of
Gemini and Google Translate, and since adjusted as per Alex Shi's
feedback.
Suggested-by: Jann Horn <jannh@google.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
Documentation/security/credentials.rst | 6 ++----
Documentation/translations/zh_CN/security/credentials.rst | 4 +---
include/linux/cred.h | 1 -
rust/helpers/task.c | 5 -----
rust/kernel/task.rs | 10 ----------
5 files changed, 3 insertions(+), 23 deletions(-)
diff --git a/Documentation/security/credentials.rst b/Documentation/security/credentials.rst
index d0191c8b8060..81d3b5737d85 100644
--- a/Documentation/security/credentials.rst
+++ b/Documentation/security/credentials.rst
@@ -393,16 +393,14 @@ the credentials so obtained when they're finished with.
The result of ``__task_cred()`` should not be passed directly to
``get_cred()`` as this may race with ``commit_cred()``.
-There are a couple of convenience functions to access bits of another task's
-credentials, hiding the RCU magic from the caller::
+There is a convenience function to access bits of another task's credentials,
+hiding the RCU magic from the caller::
uid_t task_uid(task) Task's real UID
- uid_t task_euid(task) Task's effective UID
If the caller is holding the RCU read lock at the time anyway, then::
__task_cred(task)->uid
- __task_cred(task)->euid
should be used instead. Similarly, if multiple aspects of a task's credentials
need to be accessed, RCU read lock should be used, ``__task_cred()`` called,
diff --git a/Documentation/translations/zh_CN/security/credentials.rst b/Documentation/translations/zh_CN/security/credentials.rst
index 88fcd9152ffe..20c8696f8198 100644
--- a/Documentation/translations/zh_CN/security/credentials.rst
+++ b/Documentation/translations/zh_CN/security/credentials.rst
@@ -337,15 +337,13 @@ const指针上操作,因此不需要进行类型转换,但需要临时放弃
``__task_cred()`` 的结果不应直接传递给 ``get_cred()`` ,
因为这可能与 ``commit_cred()`` 发生竞争条件。
-还有一些方便的函数可以访问另一个任务凭据的特定部分,将RCU操作对调用方隐藏起来::
+有一个方便的函数可用于访问另一个任务凭据的特定部分,从而对调用方隐藏RCU机制::
uid_t task_uid(task) Task's real UID
- uid_t task_euid(task) Task's effective UID
如果调用方在此时已经持有RCU读锁,则应使用::
__task_cred(task)->uid
- __task_cred(task)->euid
类似地,如果需要访问任务凭据的多个方面,应使用RCU读锁,调用 ``__task_cred()``
函数,将结果存储在临时指针中,然后从临时指针中调用凭据的各个方面,最后释放锁。
diff --git a/include/linux/cred.h b/include/linux/cred.h
index c6676265a985..6ef1750c93e2 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -371,7 +371,6 @@ DEFINE_FREE(put_cred, struct cred *, if (!IS_ERR_OR_NULL(_T)) put_cred(_T))
})
#define task_uid(task) (task_cred_xxx((task), uid))
-#define task_euid(task) (task_cred_xxx((task), euid))
#define task_ucounts(task) (task_cred_xxx((task), ucounts))
#define current_cred_xxx(xxx) \
diff --git a/rust/helpers/task.c b/rust/helpers/task.c
index c0e1a06ede78..b46b1433a67e 100644
--- a/rust/helpers/task.c
+++ b/rust/helpers/task.c
@@ -28,11 +28,6 @@ __rust_helper kuid_t rust_helper_task_uid(struct task_struct *task)
return task_uid(task);
}
-__rust_helper kuid_t rust_helper_task_euid(struct task_struct *task)
-{
- return task_euid(task);
-}
-
#ifndef CONFIG_USER_NS
__rust_helper uid_t rust_helper_from_kuid(struct user_namespace *to, kuid_t uid)
{
diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
index eabd65bfde12..c2b3457b700c 100644
--- a/rust/kernel/task.rs
+++ b/rust/kernel/task.rs
@@ -217,16 +217,6 @@ pub fn uid(&self) -> Kuid {
Kuid::from_raw(unsafe { bindings::task_uid(self.as_ptr()) })
}
- /// Returns the objective effective UID of the given task.
- ///
- /// You should probably not be using this; the effective UID is normally
- /// only relevant in subjective credentials.
- #[inline]
- pub fn euid(&self) -> Kuid {
- // SAFETY: It's always safe to call `task_euid` on a valid task.
- Kuid::from_raw(unsafe { bindings::task_euid(self.as_ptr()) })
- }
-
/// Determines whether the given task has pending signals.
#[inline]
pub fn signal_pending(&self) -> bool {
--
2.54.0.823.g6e5bcc1fc9-goog
^ permalink raw reply related
* [PATCH v4 1/2] rust: task: clarify comments on task UID accessors
From: Alice Ryhl @ 2026-05-29 9:33 UTC (permalink / raw)
To: Paul Moore, Serge Hallyn, Jonathan Corbet, Greg Kroah-Hartman,
Shuah Khan, Alex Shi, Yanteng Si, Dongliang Mu
Cc: Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Danilo Krummrich,
Jann Horn, linux-security-module, linux-doc, linux-kernel,
rust-for-linux, Alice Ryhl
In-Reply-To: <20260529-remove-task-euid-v4-0-07cbdf3af980@google.com>
From: Jann Horn <jannh@google.com>
Linux has separate subjective and objective task credentials, see the
comment above `struct cred`. Clarify which accessor functions operate on
which set of credentials.
Also document that Task::euid() is a very weird operation. You can see how
weird it is by grepping for task_euid() - binder is its only user.
Task::euid() obtains the objective effective UID - it looks at the
credentials of the task for purposes of acting on it as an object, but then
accesses the effective UID (which the credentials.7 man page describes as
"[...] used by the kernel to determine the permissions that the process
will have when accessing shared resources [...]").
For context:
Arguably, binder's use of task_euid() is a theoretical security problem,
which only has no impact on Android because Android has no setuid binaries
executable by apps.
commit 29bc22ac5e5b ("binder: use euid from cred instead of using task")
fixed that by removing that only user of task_euid(), but the fix got
reverted in commit c21a80ca0684 ("binder: fix test regression due to
sender_euid change") because some Android test started failing.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
Originally sent as:
https://lore.kernel.org/r/20260212-rust-uid-v1-1-deff4214c766@google.com
---
rust/kernel/task.rs | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
index 38273f4eedb5..eabd65bfde12 100644
--- a/rust/kernel/task.rs
+++ b/rust/kernel/task.rs
@@ -210,14 +210,17 @@ pub fn pid(&self) -> Pid {
unsafe { *ptr::addr_of!((*self.as_ptr()).pid) }
}
- /// Returns the UID of the given task.
+ /// Returns the objective real UID of the given task.
#[inline]
pub fn uid(&self) -> Kuid {
// SAFETY: It's always safe to call `task_uid` on a valid task.
Kuid::from_raw(unsafe { bindings::task_uid(self.as_ptr()) })
}
- /// Returns the effective UID of the given task.
+ /// Returns the objective effective UID of the given task.
+ ///
+ /// You should probably not be using this; the effective UID is normally
+ /// only relevant in subjective credentials.
#[inline]
pub fn euid(&self) -> Kuid {
// SAFETY: It's always safe to call `task_euid` on a valid task.
@@ -371,7 +374,7 @@ fn eq(&self, other: &Self) -> bool {
impl Eq for Task {}
impl Kuid {
- /// Get the current euid.
+ /// Get the current subjective effective UID.
#[inline]
pub fn current_euid() -> Kuid {
// SAFETY: Just an FFI call.
--
2.54.0.823.g6e5bcc1fc9-goog
^ permalink raw reply related
* [PATCH v4 0/2] Delete task_euid()
From: Alice Ryhl @ 2026-05-29 9:33 UTC (permalink / raw)
To: Paul Moore, Serge Hallyn, Jonathan Corbet, Greg Kroah-Hartman,
Shuah Khan, Alex Shi, Yanteng Si, Dongliang Mu
Cc: Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Danilo Krummrich,
Jann Horn, linux-security-module, linux-doc, linux-kernel,
rust-for-linux, Alice Ryhl
The task_euid() method is a very weird method, and Binder was the only
user. As of commit 65b672152289 ("binder: use current_euid() for
transaction sender identity") Binder doesn't use task_euid() anymore,
so we can delete this method.
My suggestion would be to merge this through the LSM tree.
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
Changes in v4:
- Reword 'euid' -> 'effective UID' in 'Kuid::current_euid()' docs.
- Link to v3: https://lore.kernel.org/r/20260507-remove-task-euid-v3-0-27f22f335c2c@google.com
Changes in v3:
- Include 'task' clarification commit in series.
- Rebase and resend.
- Link to v2: https://lore.kernel.org/r/20260227-remove-task-euid-v2-1-9a9c80a82eb6@google.com
Changes in v2:
- Update translation as per Alex Shi.
- Pick up Reviewed-by Gary.
- Update commit title to use cred: prefix.
- Link to v1: https://lore.kernel.org/r/20260219-remove-task-euid-v1-1-904060826e07@google.com
---
Alice Ryhl (1):
cred: delete task_euid()
Jann Horn (1):
rust: task: clarify comments on task UID accessors
Documentation/security/credentials.rst | 6 ++----
Documentation/translations/zh_CN/security/credentials.rst | 4 +---
include/linux/cred.h | 1 -
rust/helpers/task.c | 5 -----
rust/kernel/task.rs | 11 ++---------
5 files changed, 5 insertions(+), 22 deletions(-)
---
base-commit: 7fd2df204f342fc17d1a0bfcd474b24232fb0f32
change-id: 20260219-remove-task-euid-19e4b00beebe
Best regards,
--
Alice Ryhl <aliceryhl@google.com>
^ permalink raw reply
* Re: [REPORT] landlock: SCOPE_SIGNAL bypass via F_SETOWN to invoker pgid -> SIGIO/SIGKILL to non-sandboxed targets
From: hexlabsecurity @ 2026-05-29 4:43 UTC (permalink / raw)
To: Justin Suess
Cc: mic@digikod.net, gnoack@google.com,
linux-security-module@vger.kernel.org, stable@vger.kernel.org
Thanks Justin -- much appreciated for reproducing on mic/next and for the
Tested-by.
v2 below addresses your review:
- the commit message is trimmed to just the bug and the fix;
- the reproducer and the A/B verification are moved below the --- so
they become git notes, not part of the commit;
- added your Tested-by.
The fix hunk is unchanged. I agree the concise statement of the defect is
"we fail to check the subject on fan-out signal types (PIDTYPE_PGID and
PIDTYPE_SID, i.e. type > PIDTYPE_TGID)". The patch keeps the explicit
PIDTYPE_PGID / PIDTYPE_SID test for readability and to stay robust if the
enum is ever reordered -- happy to switch to "> PIDTYPE_TGID" if you
prefer. I'll follow up separately on the erratum entry and a regression
test, as you suggested.
Independent security researcher. HEXLAB SAS (registration pending) --
Cali, Colombia.
Thanks,
Bryam Vargas
----- v2 patch (inline, plain text) -----
From 75f801309cd64f74d04ef86236bd973314dd7d94 Mon Sep 17 00:00:00 2001
From: Bryam Vargas <hexlabsecurity@proton.me>
Date: Thu, 28 May 2026 23:33:13 -0500
Subject: [PATCH v2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via F_SETOWN to
invoker's pgid
A Landlock-restricted process can bypass LANDLOCK_SCOPE_SIGNAL on the
SIGIO delivery path and deliver arbitrary signals (including SIGKILL via
F_SETSIG) to non-Landlocked targets that share its pgid, by exploiting a
producer-side cache-vs-live evaluation gap.
The SIGIO path in hook_file_send_sigiotask() consults a cached subject
stored in landlock_file(file)->fown_subject at fcntl(F_SETOWN) time
(via hook_file_set_fowner()), instead of evaluating the live Landlock
domain of the invoking task at signal-send time. The capture is gated
by control_current_fowner(), which returns false (skipping capture)
when pid_task(fown->pid, fown->pid_type) is in current's thread group.
This is correct for PIDTYPE_TGID / PIDTYPE_PID, where the target is a
single task sharing current's cred. It is unsafe for PIDTYPE_PGID and
PIDTYPE_SID: when current is at the head of its pgid hlist -- the
default placement after fork(), hlist_add_head_rcu() in kernel/fork.c --
pid_task(pgid, PIDTYPE_PGID) resolves to current itself,
same_thread_group(current, current) is true, the capture is skipped, and
fown_subject.domain stays NULL. hook_file_send_sigiotask() then
short-circuits at "if (!subject->domain) return 0;", letting the kernel
fan the signal out to every member of the group, including tasks outside
current's Landlock domain that SCOPE_SIGNAL is supposed to protect.
The direct kill() path (hook_task_kill) is unaffected: it evaluates
current's live domain on every call. Only the cached SIGIO path is
broken.
Tighten control_current_fowner() to apply the thread-group exemption
only when the target identifies a single task whose Landlock cred is
necessarily shared with current (PIDTYPE_TGID, PIDTYPE_PID). For
PIDTYPE_PGID and PIDTYPE_SID, always capture the current Landlock
subject so the consumer's scope check runs against every member of the
group at delivery time.
Reported-by: Bryam Vargas <hexlabsecurity@proton.me>
Tested-by: Justin Suess <utilityemal77@gmail.com>
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
v2: per review, the commit message is trimmed to the bug + the fix; the
reproducer and the A/B verification are moved below the --- so they
stay out of the commit. Added Tested-by. The hunk is unchanged from
v1 (v1 sent to security@kernel.org 2026-05-28, embargoed -- not yet
in a public archive).
Reproducer (ordinary unprivileged user; sandbox active in the child):
int pfd[2]; pipe(pfd);
landlock_create_ruleset(&{.scoped = LANDLOCK_SCOPE_SIGNAL},
sizeof(attr), 0);
prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
landlock_restrict_self(rfd, 0);
fcntl(pfd[0], F_SETSIG, SIGKILL);
fcntl(pfd[0], F_SETOWN, -getpgrp()); /* PIDTYPE_PGID */
fcntl(pfd[0], F_SETFL, O_ASYNC);
write(pfd[1], "X", 1); /* trigger SIGIO */
/* every pgid member receives SIGKILL, including the non-sandboxed
* parent / supervisor / sibling workers */
A/B-verified on a 6.12.90 lab kernel (same .config, only this hunk
differs): pre-fix the sandboxed child's SIGKILL reaches the
non-sandboxed parent (SCOPE_SIGNAL bypassed); post-fix it is blocked.
hook_task_kill's direct-kill enforcement and the intra-thread-group
F_SETOWN cases continue to work post-patch.
security/landlock/fs.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index c1ecfe239032..edaa52572cbd 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -1909,6 +1909,18 @@ static bool control_current_fowner(struct fown_struct *const fown)
if (!p)
return true;
+ /*
+ * For PIDTYPE_PGID and PIDTYPE_SID, signal delivery fans out to
+ * every member of the group at SIGIO time. Even when pid_task()
+ * resolves to current itself (e.g., current is the pgid hlist
+ * head post-fork), non-current members of the group are still
+ * valid targets that must be checked by hook_file_send_sigiotask().
+ * Always capture the current subject for those types so the
+ * consumer scope check runs against the live fown_subject.
+ */
+ if (fown->pid_type == PIDTYPE_PGID || fown->pid_type == PIDTYPE_SID)
+ return true;
+
return !same_thread_group(p, current);
}
--
2.43.0
^ permalink raw reply related
* [PATCH] KEYS: Use acquire when reading state in keyring search
From: Gui-Dong Han @ 2026-05-29 3:34 UTC (permalink / raw)
To: keyrings, dhowells, jarkko
Cc: ebiggers, linux-security-module, linux-kernel, baijiaju1990,
Gui-Dong Han
The negative-key race fix added release/acquire ordering for key use.
Publish payload before state; read state before payload.
keyring_search_iterator() still uses READ_ONCE() before match callbacks.
An asymmetric match callback calls asymmetric_key_ids(), which reads
key->payload.data[asym_key_ids].
Use key_read_state() there to complete that ordering.
Fixes: 363b02dab09b ("KEYS: Fix race between updating and finding a negative key")
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
---
Found by auditing READ_ONCE() used for synchronization.
A similar fix can be found in 8df672bfe3ec.
---
security/keys/keyring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index b39038f7dd31..243fb1636f10 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -576,7 +576,7 @@ static int keyring_search_iterator(const void *object, void *iterator_data)
struct keyring_search_context *ctx = iterator_data;
const struct key *key = keyring_ptr_to_key(object);
unsigned long kflags = READ_ONCE(key->flags);
- short state = READ_ONCE(key->state);
+ short state = key_read_state(key);
kenter("{%d}", key->serial);
--
2.34.1
^ permalink raw reply related
* [BUG] apparmor: AA_BUG aa_policy_destroy on aa_alloc_profile error path
From: Farhad Alemi @ 2026-05-29 3:32 UTC (permalink / raw)
To: John Johansen
Cc: falemi, Tiffany Bao, Adam Doupé, Fish Wang,
Yan Shoshitaishvili, Paul Moore, James Morris, Serge E. Hallyn,
apparmor, linux-security-module, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 3589 bytes --]
Hello John and the AppArmor team,
I am reporting an AppArmor AA_BUG WARN in aa_policy_destroy() found
by syzkaller as part of research at the SEFCOM Lab at ASU.
Summary:
A write(2) to /proc/<pid>/attr/<lsm>/current that drives the
aa_change_hat() -> aa_new_learning_profile() -> aa_alloc_null() ->
aa_alloc_profile() chain takes the error-rollback path at
security/apparmor/policy.c:409 (aa_alloc_profile()'s `fail:` label
calling aa_free_profile(profile)). aa_free_profile() then calls
aa_policy_destroy(&profile->base) at security/apparmor/policy.c:327,
which trips its first AA_BUG at security/apparmor/lib.c:509:
void aa_policy_destroy(struct aa_policy *policy)
{
AA_BUG(on_list_rcu(&policy->profiles)); <-- :509
AA_BUG(on_list_rcu(&policy->list));
...
}
/* security/apparmor/include/policy.h:60 */
#define on_list_rcu(X) (!list_empty(X) && (X)->prev != LIST_POISON2)
The WARN reproduces the macro's condition verbatim (the kernel prints
the full stringified expression including the LIST_POISON2 numeric
0x122 + 0xdead000000000000UL); see crash-report.txt for the full
header.
Observed on:
- Linux v7.1-rc3-200-g70eda68668d1-dirty (the only local dirty file
is drivers/tty/serial/serial_core.c, a console guard our fuzzing
harness uses, unrelated to security/apparmor/), x86_64, QEMU Q35
- AA_BUG asserts enabled + panic_on_warn (the crash tail prints
"Kernel panic - not syncing: kernel: panic_on_warn set")
- Source inspection of linus/master at commit e8c2f9fdadee
(v7.1-rc4-754-ge8c2f9fdadee) shows the buggy structure is
unchanged: security/apparmor/lib.c:509 still does
AA_BUG(on_list_rcu(&policy->profiles)); aa_alloc_profile()'s fail
path at security/apparmor/policy.c:409 still calls
aa_free_profile(profile); aa_free_profile() at policy.c:327 still
calls aa_policy_destroy(&profile->base). As no reproducer is available
for this seed, I have not re-triggered the crash against e8c2f9fdadee.
Expected behavior:
Either aa_alloc_profile()'s rollback path must guarantee
profile->base.profiles is empty (or list_del'd so prev == LIST_POISON2)
before calling aa_free_profile(), or aa_policy_destroy()'s AA_BUG
should be softened to a WARN_ON-and-drain so it does not panic on an
alloc-rollback path. The maintainers are best placed to choose which
side of the contract owns this.
Reproducer:
A standalone .syz or C reproducer was not produced for this seed;
the crash fired during automated /proc/<pid>/attr/* fuzzing. The
console report is attached as crash-report.txt.
Novelty check:
I searched the syzbot dashboard's upstream open, fixed, stable, and
invalid (per-subsystem apparmor) namespaces; the Android dashboard;
the marc.info linux-security-module archive; and the complete
apparmor@lists.ubuntu.com list archive (2010 through 2026, full
message bodies), for "aa_policy_destroy", "on_list_rcu(&policy->
profiles)", "aa_alloc_profile" + "WARNING", and "AA_BUG" +
"policy->profiles". I did not find a prior report of this crash. The
three apparmor-titled entries in the syzbot invalid namespace are in
different functions (apparmor_sk_free_security UAF, aa_label_sk_perm
UAF, apparmor_file_open data-race). The only aa_policy_destroy
mentions on the AppArmor list are a 2022 "Fix memleak in alloc_ns()"
patch (a different aa_policy_destroy(&ns->base) call site), and there
is no occurrence of on_list_rcu(&policy->profiles) anywhere in the
list history.
I appreciate your time and consideration, and I'm grateful for your
work on this subsystem. I'd be glad to test any candidate patches.
Regards,
[-- Attachment #2: crash-report.txt --]
[-- Type: text/plain, Size: 8182 bytes --]
</TASK>
------------[ cut here ]------------
AppArmor WARN aa_policy_destroy: (((!list_empty(&policy->profiles) && (&policy->profiles)->prev != ((void *) 0x122 + (0xdead000000000000UL))))):
WARNING: security/apparmor/lib.c:509 at aa_policy_destroy+0x169/0x1c0 security/apparmor/lib.c:509, CPU#0: syz.3.739/13898
Modules linked in:
CPU: 0 UID: 0 PID: 13898 Comm: syz.3.739 Not tainted 7.1.0-rc3-00200-g70eda68668d1-dirty #1 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:aa_policy_destroy+0x170/0x1c0 security/apparmor/lib.c:509
Code: 85 ed 7e 4d e8 c1 9a dc fd 5b 41 5c 41 5e 41 5f 5d c3 cc cc cc cc cc e8 ae 9a dc fd 48 8d 3d 87 1c 0b 05 48 c7 c6 b8 a7 82 87 <67> 48 0f b9 3a e9 04 ff ff ff e8 91 9a dc fd 48 8d 3d 7a 1c 0b 05
RSP: 0018:ffffc9000141f500 EFLAGS: 00010293
RAX: ffffffff83a572b2 RBX: ffff88811907a400 RCX: ffff88812f778000
RDX: 0000000000000000 RSI: ffffffff8782a7b8 RDI: ffffffff88b08f40
RBP: 0000000000000cc0 R08: 0000000000000cc0 R09: 00000000ffffffff
R10: dffffc0000000000 R11: fffffbfff100a27f R12: dead000000000122
R13: ffff88811907a400 R14: ffff88811907a428 R15: dffffc0000000000
FS: 00007f51fd2d76c0(0000) GS:ffff8882ab6b6000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f51fe8cfe10 CR3: 000000011b6ea000 CR4: 0000000000750ef0
PKRU: 80000000
Call Trace:
<TASK>
aa_free_profile+0xa2/0x9f0 security/apparmor/policy.c:327
aa_alloc_profile+0x1f1/0x3f0 security/apparmor/policy.c:409
aa_alloc_null+0x2d/0x530 security/apparmor/policy.c:690
aa_new_learning_profile+0x226/0x4e0 security/apparmor/policy.c:767
build_change_hat+0x292/0x400 security/apparmor/domain.c:1079
change_hat security/apparmor/domain.c:1193 [inline]
aa_change_hat+0x1177/0x2fb0 security/apparmor/domain.c:1269
aa_setprocattr_changehat+0x4a6/0x5b0 security/apparmor/procattr.c:138
do_setattr+0x548/0x6a0
proc_pid_attr_write+0x5d1/0x630 fs/proc/base.c:2844
vfs_write+0x29f/0xb90 fs/read_write.c:686
ksys_write+0x155/0x270 fs/read_write.c:740
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x15f/0x560 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f51fe88778d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f51fd2d7018 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f51feb15fa0 RCX: 00007f51fe88778d
RDX: 0000000000000022 RSI: 00002000000000c0 RDI: 0000000000000003
RBP: 00007f51fd2d7080 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007f51feb16038 R14: 00007f51feb15fa0 R15: 00007ffc4916b870
</TASK>
----------------
Code disassembly (best guess):
0: 85 ed test %ebp,%ebp
2: 7e 4d jle 0x51
4: e8 c1 9a dc fd call 0xfddc9aca
9: 5b pop %rbx
a: 41 5c pop %r12
c: 41 5e pop %r14
e: 41 5f pop %r15
10: 5d pop %rbp
11: c3 ret
12: cc int3
13: cc int3
14: cc int3
15: cc int3
16: cc int3
17: e8 ae 9a dc fd call 0xfddc9aca
1c: 48 8d 3d 87 1c 0b 05 lea 0x50b1c87(%rip),%rdi # 0x50b1caa
23: 48 c7 c6 b8 a7 82 87 mov $0xffffffff8782a7b8,%rsi
* 2a: 67 48 0f b9 3a ud1 (%edx),%rdi <-- trapping instruction
2f: e9 04 ff ff ff jmp 0xffffff38
34: e8 91 9a dc fd call 0xfddc9aca
39: 48 8d 3d 7a 1c 0b 05 lea 0x50b1c7a(%rip),%rdi # 0x50b1cba
<<<<<<<<<<<<<<< tail report >>>>>>>>>>>>>>>
Modules linked in:
CPU: 0 UID: 0 PID: 13898 Comm: syz.3.739 Not tainted 7.1.0-rc3-00200-g70eda68668d1-dirty #1 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:aa_policy_destroy+0x170/0x1c0
Code: 85 ed 7e 4d e8 c1 9a dc fd 5b 41 5c 41 5e 41 5f 5d c3 cc cc cc cc cc e8 ae 9a dc fd 48 8d 3d 87 1c 0b 05 48 c7 c6 b8 a7 82 87 <67> 48 0f b9 3a e9 04 ff ff ff e8 91 9a dc fd 48 8d 3d 7a 1c 0b 05
RSP: 0018:ffffc9000141f500 EFLAGS: 00010293
RAX: ffffffff83a572b2 RBX: ffff88811907a400 RCX: ffff88812f778000
RDX: 0000000000000000 RSI: ffffffff8782a7b8 RDI: ffffffff88b08f40
RBP: 0000000000000cc0 R08: 0000000000000cc0 R09: 00000000ffffffff
R10: dffffc0000000000 R11: fffffbfff100a27f R12: dead000000000122
R13: ffff88811907a400 R14: ffff88811907a428 R15: dffffc0000000000
FS: 00007f51fd2d76c0(0000) GS:ffff8882ab6b6000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f51fe8cfe10 CR3: 000000011b6ea000 CR4: 0000000000750ef0
PKRU: 80000000
Call Trace:
<TASK>
aa_free_profile+0xa2/0x9f0
aa_alloc_profile+0x1f1/0x3f0
aa_alloc_null+0x2d/0x530
aa_new_learning_profile+0x226/0x4e0
build_change_hat+0x292/0x400
aa_change_hat+0x1177/0x2fb0
aa_setprocattr_changehat+0x4a6/0x5b0
do_setattr+0x548/0x6a0
proc_pid_attr_write+0x5d1/0x630
vfs_write+0x29f/0xb90
ksys_write+0x155/0x270
do_syscall_64+0x15f/0x560
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f51fe88778d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f51fd2d7018 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f51feb15fa0 RCX: 00007f51fe88778d
RDX: 0000000000000022 RSI: 00002000000000c0 RDI: 0000000000000003
RBP: 00007f51fd2d7080 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007f51feb16038 R14: 00007f51feb15fa0 R15: 00007ffc4916b870
</TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 0 UID: 0 PID: 13898 Comm: syz.3.739 Not tainted 7.1.0-rc3-00200-g70eda68668d1-dirty #1 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
<TASK>
vpanic+0x571/0xa60
panic+0xca/0xd0
__warn+0x31a/0x4d0
__report_bug+0x29a/0x540
report_bug_entry+0x19a/0x290
handle_bug+0xce/0x200
exc_invalid_op+0x1a/0x50
asm_exc_invalid_op+0x1a/0x20
RIP: 0010:aa_policy_destroy+0x170/0x1c0
Code: 85 ed 7e 4d e8 c1 9a dc fd 5b 41 5c 41 5e 41 5f 5d c3 cc cc cc cc cc e8 ae 9a dc fd 48 8d 3d 87 1c 0b 05 48 c7 c6 b8 a7 82 87 <67> 48 0f b9 3a e9 04 ff ff ff e8 91 9a dc fd 48 8d 3d 7a 1c 0b 05
RSP: 0018:ffffc9000141f500 EFLAGS: 00010293
RAX: ffffffff83a572b2 RBX: ffff88811907a400 RCX: ffff88812f778000
RDX: 0000000000000000 RSI: ffffffff8782a7b8 RDI: ffffffff88b08f40
RBP: 0000000000000cc0 R08: 0000000000000cc0 R09: 00000000ffffffff
R10: dffffc0000000000 R11: fffffbfff100a27f R12: dead000000000122
R13: ffff88811907a400 R14: ffff88811907a428 R15: dffffc0000000000
aa_free_profile+0xa2/0x9f0
aa_alloc_profile+0x1f1/0x3f0
aa_alloc_null+0x2d/0x530
aa_new_learning_profile+0x226/0x4e0
build_change_hat+0x292/0x400
aa_change_hat+0x1177/0x2fb0
aa_setprocattr_changehat+0x4a6/0x5b0
do_setattr+0x548/0x6a0
proc_pid_attr_write+0x5d1/0x630
vfs_write+0x29f/0xb90
ksys_write+0x155/0x270
do_syscall_64+0x15f/0x560
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f51fe88778d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f51fd2d7018 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f51feb15fa0 RCX: 00007f51fe88778d
RDX: 0000000000000022 RSI: 00002000000000c0 RDI: 0000000000000003
RBP: 00007f51fd2d7080 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007f51feb16038 R14: 00007f51feb15fa0 R15: 00007ffc4916b870
</TASK>
Kernel Offset: disabled
Rebooting in 86400 seconds..
<<<<<<<<<<<<<<< tail report >>>>>>>>>>>>>>>
^ permalink raw reply
* Re: [PATCH] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via F_SETOWN to invoker's pgid
From: Justin Suess @ 2026-05-29 3:25 UTC (permalink / raw)
To: hexlabsecurity
Cc: mic@digikod.net, gnoack@google.com,
linux-security-module@vger.kernel.org, stable@vger.kernel.org
In-Reply-To: <cFjmBkbTY-D5pYl66NixBeqbhWBzS7kBEUHCWbhTQwkiuvKg8xNkSEf9rYqDQiD76er1gK8Q6t1YOJ4nIPuvILuwG42d8_rfMZpQ5VmJru0=@proton.me>
On Thu, May 28, 2026 at 09:21:50PM +0000, hexlabsecurity@proton.me wrote:
> From 22a0086b44beaaef01883e047dd4a8b8bc3153e9 Mon Sep 17 00:00:00 2001
> From: Bryam Vargas <hexlabsecurity@proton.me>
> Date: Thu, 28 May 2026 01:30:00 -0500
> Subject: [PATCH] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via F_SETOWN to
> invoker's pgid
>
> A Landlock-restricted process can bypass LANDLOCK_SCOPE_SIGNAL on the
> SIGIO delivery path and deliver arbitrary signals (including SIGKILL via
> F_SETSIG) to non-Landlocked targets that share its pgid, by exploiting a
> producer-side cache-vs-live evaluation gap.
>
> The SIGIO path in hook_file_send_sigiotask() consults a cached subject
> stored in landlock_file(file)->fown_subject at fcntl(F_SETOWN) time
> (via hook_file_set_fowner()), instead of evaluating the live Landlock
> domain of the invoking task at signal-send time. The capture is gated
> by control_current_fowner(), which returns false (skipping capture)
> when pid_task(fown->pid, fown->pid_type) is in current's thread group.
>
> This is correct for PIDTYPE_TGID / PIDTYPE_PID, where the target is a
> single thread or thread-group leader sharing current's cred. It is
> unsafe for PIDTYPE_PGID and PIDTYPE_SID: when current is at the head
> of its pgid hlist -- the default placement after fork(),
> hlist_add_head_rcu() in kernel/fork.c -- pid_task(pgid, PIDTYPE_PGID)
> resolves to current itself, same_thread_group(current, current) is
> true, the capture is skipped, and fown_subject.domain stays NULL.
>
> hook_file_send_sigiotask() then short-circuits at
> "if (!subject->domain) return 0;", allowing the kernel to fan the
> signal out to every member of the group, including tasks outside
> current's Landlock domain that the SCOPE_SIGNAL contract is supposed
> to protect.
>
> The direct kill() path (hook_task_kill) is unaffected: it evaluates
> current's live domain on every call. Only the cached SIGIO path is
> broken.
>
> Repro (ordinary unprivileged user; sandbox active in the child):
>
> int pfd[2]; pipe(pfd);
> landlock_create_ruleset(&{.scoped = LANDLOCK_SCOPE_SIGNAL},
> sizeof(attr), 0);
> prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
> landlock_restrict_self(rfd, 0);
> fcntl(pfd[0], F_SETSIG, SIGKILL);
> fcntl(pfd[0], F_SETOWN, -getpgrp()); /* PIDTYPE_PGID */
> fcntl(pfd[0], F_SETFL, O_ASYNC);
> write(pfd[1], "X", 1); /* trigger SIGIO */
> /* every pgid member receives SIGKILL, including non-sandboxed
> * parent / supervisor / sibling workers */
>
I was able to reproduce this on mic/next.
Great catch!
> Tighten control_current_fowner() to apply the thread-group exemption
> only when the target identifies a SINGLE task whose Landlock cred is
> necessarily shared with current (PIDTYPE_TGID, PIDTYPE_PID). For
> PIDTYPE_PGID and PIDTYPE_SID, always capture the current Landlock
> subject so the consumer's scope check runs against every member of
> the group at delivery time.
>
> Empirically A/B-verified on a 6.12.90 lab kernel (same .config, only
> the patch hunk differs): pre-fix build exits with "BUG PRESENT --
> SCOPE_SIGNAL BYPASSED", post-fix build exits with "SANDBOX HELD".
> hook_task_kill's direct-kill enforcement and the intra-thread-group
> F_SETOWN cases continue to work post-patch.
>
> Reported-by: Bryam Vargas <hexlabsecurity@proton.me>
> Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
> ---
> security/landlock/fs.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index c1ecfe239032..edaa52572cbd 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -1909,6 +1909,18 @@ static bool control_current_fowner(struct fown_struct *const fown)
> if (!p)
> return true;
>
> + /*
> + * For PIDTYPE_PGID and PIDTYPE_SID, signal delivery fans out to
> + * every member of the group at SIGIO time. Even when pid_task()
> + * resolves to current itself (e.g., current is the pgid hlist
> + * head post-fork), non-current members of the group are still
> + * valid targets that must be checked by hook_file_send_sigiotask().
> + * Always capture the current subject for those types so the
> + * consumer scope check runs against the live fown_subject.
> + */
> + if (fown->pid_type == PIDTYPE_PGID || fown->pid_type == PIDTYPE_SID)
> + return true;
This seems right.
So basically we are failing to check the subject on fan-out
signals where type > PIDTYPE_TGID (ie PIDTYPE_PGID/SID).
But this fix seems good as is to me and closed the reproducer hole in my
test. Unless there are some edge cases I'm missing.
The commit message could use some cleanup and shortening. No need to
include the reproducer (though it was helpful) and the "BUG_PRESENT"/
"SANDBOX_HELD"/ AB testing stuff. Just explain the bug and what
it fixes :)
You can add the reproducer and stuff below the --- in the patch and
above the diffstat in the future to make it part of the git notes and
not the actual commit.
That way you can add anything else that doesn't belong in the actual
commit but is important for context.
This may need an erratum entry and a regression test in the future,
but that can be done seperately.
Again great job!
Tested-by: Justin Suess <utilityemal77@gmail.com>
> +
> return !same_thread_group(p, current);
> }
>
> --
> 2.43.0
>
^ permalink raw reply
* Re: [PATCH v9 4/9] samples/landlock: Add quiet flag support to sandboxer
From: Justin Suess @ 2026-05-29 2:34 UTC (permalink / raw)
To: Tingmao Wang
Cc: Mickaël Salaün, Günther Noack, Jan Kara,
Abhinav Saxena, linux-security-module
In-Reply-To: <7d5ad9631a51df6c2b857ff9c0122ff8ed491b7d.1779843375.git.m@maowtm.org>
On Wed, May 27, 2026 at 02:01:14AM +0100, Tingmao Wang wrote:
> Adds ability to set which access bits to quiet via LL_*_QUIET_ACCESS (FS,
> NET or SCOPED), and attach quiet flags to individual objects via
> LL_*_QUIET for FS and NET.
>
> Signed-off-by: Tingmao Wang <m@maowtm.org>
> ---
>
> Changes in v9:
> - Add udp connect / bind quiet flag support
>
> Changes in v8:
> - Rebase on top of mic/next
> - populate_ruleset_net() already does not require the env var to be
> present, so remove redundant comment and check above
> populate_ruleset_net(ENV_NET_QUIET_NAME, ...).
>
> Changes in v6:
> - Make populate_ruleset_{fs,net} take a flags argument instead of a bool
> quiet (suggested by Justin Suess)
> - Fix if braces style
>
> Changes in v3:
> - Minor change to the above commit message.
>
> Changes in v2:
> - Added new environment variables to control which quiet access bits to
> set on the rule, and populate quiet_access_* from it.
> - Added support for quieting net rules and scoped access. Renamed patch
> title.
> - Increment ABI version
>
> samples/landlock/sandboxer.c | 134 ++++++++++++++++++++++++++++++++---
> 1 file changed, 123 insertions(+), 11 deletions(-)
>
> diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
> index 94e399e6b146..74ee53afed6a 100644
> --- a/samples/landlock/sandboxer.c
> +++ b/samples/landlock/sandboxer.c
> @@ -58,9 +58,14 @@ static inline int landlock_restrict_self(const int ruleset_fd,
>
> #define ENV_FS_RO_NAME "LL_FS_RO"
> #define ENV_FS_RW_NAME "LL_FS_RW"
> +#define ENV_FS_QUIET_NAME "LL_FS_QUIET"
> +#define ENV_FS_QUIET_ACCESS_NAME "LL_FS_QUIET_ACCESS"
> #define ENV_TCP_BIND_NAME "LL_TCP_BIND"
> #define ENV_TCP_CONNECT_NAME "LL_TCP_CONNECT"
> +#define ENV_NET_QUIET_NAME "LL_NET_QUIET"
> +#define ENV_NET_QUIET_ACCESS_NAME "LL_NET_QUIET_ACCESS"
> #define ENV_SCOPED_NAME "LL_SCOPED"
> +#define ENV_SCOPED_QUIET_ACCESS_NAME "LL_SCOPED_QUIET_ACCESS"
> #define ENV_FORCE_LOG_NAME "LL_FORCE_LOG"
> #define ENV_UDP_BIND_NAME "LL_UDP_BIND"
> #define ENV_UDP_CONNECT_SEND_NAME "LL_UDP_CONNECT_SEND"
> @@ -119,7 +124,7 @@ static int parse_path(char *env_path, const char ***const path_list)
> /* clang-format on */
>
> static int populate_ruleset_fs(const char *const env_var, const int ruleset_fd,
> - const __u64 allowed_access)
> + const __u64 allowed_access, __u32 flags)
> {
> int num_paths, i, ret = 1;
> char *env_path_name;
> @@ -169,7 +174,7 @@ static int populate_ruleset_fs(const char *const env_var, const int ruleset_fd,
> if (!S_ISDIR(statbuf.st_mode))
> path_beneath.allowed_access &= ACCESS_FILE;
> if (landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
> - &path_beneath, 0)) {
> + &path_beneath, flags)) {
> fprintf(stderr,
> "Failed to update the ruleset with \"%s\": %s\n",
> path_list[i], strerror(errno));
> @@ -187,7 +192,7 @@ static int populate_ruleset_fs(const char *const env_var, const int ruleset_fd,
> }
>
> static int populate_ruleset_net(const char *const env_var, const int ruleset_fd,
> - const __u64 allowed_access)
> + const __u64 allowed_access, __u32 flags)
> {
> int ret = 1;
> char *env_port_name, *env_port_name_next, *strport;
> @@ -215,7 +220,7 @@ static int populate_ruleset_net(const char *const env_var, const int ruleset_fd,
> }
> net_port.port = port;
> if (landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
> - &net_port, 0)) {
> + &net_port, flags)) {
> fprintf(stderr,
> "Failed to update the ruleset with port \"%llu\": %s\n",
> net_port.port, strerror(errno));
> @@ -303,6 +308,58 @@ static bool check_ruleset_scope(const char *const env_var,
>
> /* clang-format on */
>
> +static int add_quiet_access(__u64 *const quiet_access,
> + const __u64 handled_access,
> + const char *const env_var, const bool default_all)
> +{
> + char *env_quiet_access, *env_quiet_access_next, *str_access;
> +
> + if (default_all)
> + *quiet_access = handled_access;
> + else
> + *quiet_access = 0;
> +
> + env_quiet_access = getenv(env_var);
> + if (!env_quiet_access)
> + return 0;
> +
> + env_quiet_access = strdup(env_quiet_access);
> + env_quiet_access_next = env_quiet_access;
> + unsetenv(env_var);
> + *quiet_access = 0;
> +
> + while ((str_access = strsep(&env_quiet_access_next, ENV_DELIMITER))) {
> + if (strcmp(str_access, "") == 0)
> + continue;
> + else if (strcmp(str_access, "r") == 0)
> + *quiet_access |= ACCESS_FS_ROUGHLY_READ;
> + else if (strcmp(str_access, "w") == 0)
> + *quiet_access |= ACCESS_FS_ROUGHLY_WRITE;
> + else if (strcmp(str_access, "b") == 0)
> + *quiet_access |= LANDLOCK_ACCESS_NET_BIND_TCP;
> + else if (strcmp(str_access, "c") == 0)
> + *quiet_access |= LANDLOCK_ACCESS_NET_CONNECT_TCP;
> + else if (strcmp(str_access, "ub") == 0)
> + *quiet_access |= LANDLOCK_ACCESS_NET_BIND_UDP;
> + else if (strcmp(str_access, "uc") == 0)
> + *quiet_access |= LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP;
> + else if (strcmp(str_access, "a") == 0)
> + *quiet_access |= LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET;
> + else if (strcmp(str_access, "s") == 0)
> + *quiet_access |= LANDLOCK_SCOPE_SIGNAL;
You don't need to do it in this patch but these strings should probably
be centrally defined somewhere... as we add more they could be easy to
mix up.
> + else {
> + fprintf(stderr, "Unknown quiet access \"%s\"\n",
> + str_access);
> + free(env_quiet_access);
> + return -1;
> + }
> + }
> +
> + free(env_quiet_access);
> + *quiet_access &= handled_access;
> + return 0;
> +}
> +
> #define LANDLOCK_ABI_LAST 10
>
> #define XSTR(s) #s
> @@ -336,6 +393,22 @@ static const char help[] =
> "\n"
> "A sandboxer should not log denied access requests to avoid spamming logs, "
> "but to test audit we can set " ENV_FORCE_LOG_NAME "=1\n"
> + ENV_FS_QUIET_NAME " and " ENV_NET_QUIET_NAME ", both optional, can then be used "
> + "to make access to some denied paths or network ports not trigger audit logging.\n"
> + ENV_FS_QUIET_ACCESS_NAME " and " ENV_NET_QUIET_ACCESS_NAME " can be used to specify "
> + "which accesses should be quieted (defaults to all):\n"
> + "* " ENV_FS_QUIET_ACCESS_NAME ": file system accesses to quiet\n"
> + " - \"r\" to quiet all file/dir read accesses\n"
> + " - \"w\" to quiet all file/dir write accesses\n"
> + "* " ENV_NET_QUIET_ACCESS_NAME ": network accesses to quiet\n"
> + " - \"b\" to quiet tcp bind denials\n"
> + " - \"c\" to quiet tcp connect denials\n"
> + " - \"ub\" to quiet udp bind denials\n"
> + " - \"uc\" to quiet udp connect / send denials\n"
> + "In addition, " ENV_SCOPED_QUIET_ACCESS_NAME " can be set to quiet all denials for "
> + "scoped actions (defaults to none).\n"
> + " - \"a\" to quiet abstract unix socket denials\n"
> + " - \"s\" to quiet signal denials\n"
> "\n"
> "Example:\n"
> ENV_FS_RO_NAME "=\"${PATH}:/lib:/usr:/proc:/etc:/dev/urandom\" "
> @@ -368,7 +441,12 @@ int main(const int argc, char *const argv[], char *const *const envp)
> LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP,
> .scoped = LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
> LANDLOCK_SCOPE_SIGNAL,
> + .quiet_access_fs = 0,
> + .quiet_access_net = 0,
> + .quiet_scoped = 0,
> };
> +
> + bool quiet_supported = true;
> int supported_restrict_flags = LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON;
> int set_restrict_flags = 0;
>
> @@ -459,6 +537,9 @@ int main(const int argc, char *const argv[], char *const *const envp)
> ruleset_attr.handled_access_net &=
> ~(LANDLOCK_ACCESS_NET_BIND_UDP |
> LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP);
> + __attribute__((fallthrough));
The fallthrough should be the last statement in the switch case;
otherwise this causes a build warning.
> + /* Don't add quiet flags for ABI < 10 later on. */
> + quiet_supported = false;
>
> /* Must be printed for any ABI < LANDLOCK_ABI_LAST. */
> fprintf(stderr,
> @@ -525,6 +606,25 @@ int main(const int argc, char *const argv[], char *const *const envp)
> unsetenv(ENV_FORCE_LOG_NAME);
> }
>
> + /*
> + * Add quiet for fs/net handled access bits. Doing this alone has no
> + * effect unless we later add quiet rules per FS_QUIET/NET_QUIET.
> + */
> + if (quiet_supported) {
> + if (add_quiet_access(&ruleset_attr.quiet_access_fs,
> + ruleset_attr.handled_access_fs,
> + ENV_FS_QUIET_ACCESS_NAME, true))
> + return 1;
> + if (add_quiet_access(&ruleset_attr.quiet_access_net,
> + ruleset_attr.handled_access_net,
> + ENV_NET_QUIET_ACCESS_NAME, true))
> + return 1;
> + if (add_quiet_access(&ruleset_attr.quiet_scoped,
> + ruleset_attr.scoped,
> + ENV_SCOPED_QUIET_ACCESS_NAME, false))
> + return 1;
> + }
> +
> ruleset_fd =
> landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
> if (ruleset_fd < 0) {
> @@ -532,30 +632,42 @@ int main(const int argc, char *const argv[], char *const *const envp)
> return 1;
> }
>
> - if (populate_ruleset_fs(ENV_FS_RO_NAME, ruleset_fd, access_fs_ro)) {
> + if (populate_ruleset_fs(ENV_FS_RO_NAME, ruleset_fd, access_fs_ro, 0))
> goto err_close_ruleset;
> - }
> - if (populate_ruleset_fs(ENV_FS_RW_NAME, ruleset_fd, access_fs_rw)) {
> + if (populate_ruleset_fs(ENV_FS_RW_NAME, ruleset_fd, access_fs_rw, 0))
> goto err_close_ruleset;
> +
> + /* Don't require this env to be present. */
> + if (quiet_supported && getenv(ENV_FS_QUIET_NAME)) {
> + if (populate_ruleset_fs(ENV_FS_QUIET_NAME, ruleset_fd, 0,
> + LANDLOCK_ADD_RULE_QUIET))
> + goto err_close_ruleset;
> }
>
> if (populate_ruleset_net(ENV_TCP_BIND_NAME, ruleset_fd,
> - LANDLOCK_ACCESS_NET_BIND_TCP)) {
> + LANDLOCK_ACCESS_NET_BIND_TCP, 0)) {
> goto err_close_ruleset;
> }
> if (populate_ruleset_net(ENV_TCP_CONNECT_NAME, ruleset_fd,
> - LANDLOCK_ACCESS_NET_CONNECT_TCP)) {
> + LANDLOCK_ACCESS_NET_CONNECT_TCP, 0)) {
> goto err_close_ruleset;
> }
> if (populate_ruleset_net(ENV_UDP_BIND_NAME, ruleset_fd,
> - LANDLOCK_ACCESS_NET_BIND_UDP)) {
> + LANDLOCK_ACCESS_NET_BIND_UDP, 0)) {
> goto err_close_ruleset;
> }
> if (populate_ruleset_net(ENV_UDP_CONNECT_SEND_NAME, ruleset_fd,
> - LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP)) {
> + LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP, 0)) {
> goto err_close_ruleset;
> }
>
> + if (quiet_supported) {
> + if (populate_ruleset_net(ENV_NET_QUIET_NAME, ruleset_fd, 0,
> + LANDLOCK_ADD_RULE_QUIET)) {
> + goto err_close_ruleset;
> + }
> + }
> +
> if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
> perror("Failed to restrict privileges");
> goto err_close_ruleset;
> --
> 2.54.0
^ permalink raw reply
* [PATCH v8 10/10] landlock: Add KUnit tests for LANDLOCK_ADD_RULE_NO_INHERIT
From: Justin Suess @ 2026-05-29 1:52 UTC (permalink / raw)
To: gnoack3000, mic; +Cc: linux-kernel, linux-security-module, Justin Suess
In-Reply-To: <20260529015210.500291-1-utilityemal77@gmail.com>
Add the landlock_ruleset KUnit suite with five tests for the
no_inherit handling in landlock_unmask_layers():
- test_unmask_no_inherit_propagates: a rule with no_inherit unmasks
access and sets the no_inherit bit on the layer mask.
- test_unmask_no_inherit_skip: a layer with no_inherit already set in
the mask is skipped (no access removal).
- test_unmask_no_inherit_both_set: when both rule and mask have
no_inherit, the skip still happens and the bit stays set.
- test_unmask_multilayer_no_inherit: no_inherit on one layer of a
multi-layer rule only affects that layer.
- test_unmask_no_inherit_sequential: applying a descendant rule
(no_inherit) followed by an ancestor rule causes the ancestor to be
skipped, modeling a path walk.
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---
Notes:
v7..v8 changes:
* Renamed patch from 'Implement KUnit test' to 'Add KUnit tests'
(now plural).
* Replaced the single test_unmask_layers_no_inherit() case with
five focused tests aligned with the new per-layer no_inherit
bit added to struct layer_mask in patch 6:
- test_unmask_no_inherit_propagates
- test_unmask_no_inherit_skip
- test_unmask_no_inherit_both_set
- test_unmask_multilayer_no_inherit
- test_unmask_no_inherit_sequential
* Added alloc_rule() and fill_masks() helpers to share setup
across the new tests.
security/landlock/ruleset.c | 182 ++++++++++++++++++++++++++++++++++++
1 file changed, 182 insertions(+)
diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index c78e2b2d73ff..9f47d106aca3 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -6,6 +6,7 @@
* Copyright © 2018-2020 ANSSI
*/
+#include <kunit/test.h>
#include <linux/bits.h>
#include <linux/bug.h>
#include <linux/cleanup.h>
@@ -766,3 +767,184 @@ landlock_init_layer_masks(const struct landlock_ruleset *const domain,
return handled_accesses;
}
+
+#ifdef CONFIG_SECURITY_LANDLOCK_KUNIT_TEST
+
+/*
+ * Helper to allocate a rule with @num_layers layers and initialize
+ * its num_layers field. Caller must fill in individual layers.
+ */
+static struct landlock_rule *alloc_rule(struct kunit *test, u32 num_layers)
+{
+ struct landlock_rule *rule;
+
+ rule = kzalloc(struct_size(rule, layers, num_layers), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, rule);
+ rule->num_layers = num_layers;
+ return rule;
+}
+
+/*
+ * Build a layer_masks with the first @num_layers layers' access set to
+ * @val, and all no_inherit flags cleared. Layers beyond @num_layers stay
+ * zeroed, matching what landlock_init_layer_masks() produces for a domain
+ * with that many layers.
+ */
+static void fill_masks(struct layer_masks *masks, access_mask_t val,
+ size_t num_layers)
+{
+ memset(masks, 0, sizeof(*masks));
+ for (size_t i = 0; i < num_layers; i++)
+ masks->layers[i].access = val;
+}
+
+/* Verify that a rule with no_inherit unmasks access and propagates the flag. */
+static void test_unmask_no_inherit_propagates(struct kunit *const test)
+{
+ struct landlock_rule *rule = alloc_rule(test, 1);
+ struct layer_masks masks;
+ const access_mask_t req = BIT_ULL(0) | BIT_ULL(1);
+
+ rule->layers[0].level = 1;
+ rule->layers[0].access = BIT_ULL(0);
+ rule->layers[0].flags.no_inherit = true;
+
+ fill_masks(&masks, req, 1);
+ landlock_unmask_layers(rule, &masks);
+
+ /* access bit 0 should be cleared, bit 1 remains */
+ KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[0].access,
+ BIT_ULL(1));
+ KUNIT_EXPECT_TRUE(test, masks.layers[0].no_inherit);
+ KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[1].access, 0);
+ kfree(rule);
+}
+
+/* Verify that a pre-set no_inherit in the mask causes the layer to be skipped. */
+static void test_unmask_no_inherit_skip(struct kunit *const test)
+{
+ struct landlock_rule *rule = alloc_rule(test, 1);
+ struct layer_masks masks;
+ const access_mask_t req = BIT_ULL(0);
+
+ rule->layers[0].level = 1;
+ rule->layers[0].access = BIT_ULL(0);
+
+ fill_masks(&masks, req, 1);
+ masks.layers[0].no_inherit = true;
+ landlock_unmask_layers(rule, &masks);
+
+ /* bit 0 should NOT be cleared because layer was skipped */
+ KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[0].access, req);
+ KUNIT_EXPECT_TRUE(test, masks.layers[0].no_inherit);
+ kfree(rule);
+}
+
+/*
+ * Verify that no_inherit on the rule is still set when the mask already
+ * has no_inherit (the skip prevents access removal but the flag propagates).
+ */
+static void test_unmask_no_inherit_both_set(struct kunit *const test)
+{
+ struct landlock_rule *rule = alloc_rule(test, 1);
+ struct layer_masks masks;
+ const access_mask_t req = BIT_ULL(0);
+
+ rule->layers[0].level = 1;
+ rule->layers[0].access = BIT_ULL(0);
+ rule->layers[0].flags.no_inherit = true;
+
+ fill_masks(&masks, req, 1);
+ masks.layers[0].no_inherit = true;
+ landlock_unmask_layers(rule, &masks);
+
+ KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[0].access, req);
+ KUNIT_EXPECT_TRUE(test, masks.layers[0].no_inherit);
+ kfree(rule);
+}
+
+/*
+ * Verify that no_inherit on layer 1 of a multi-layer rule only affects
+ * layer 1; layer 2 still contributes normally.
+ */
+static void test_unmask_multilayer_no_inherit(struct kunit *const test)
+{
+ struct landlock_rule *rule = alloc_rule(test, 2);
+ struct layer_masks masks;
+ const access_mask_t req = BIT_ULL(0) | BIT_ULL(1);
+
+ rule->layers[0].level = 1;
+ rule->layers[0].access = BIT_ULL(0);
+ rule->layers[0].flags.no_inherit = true;
+
+ rule->layers[1].level = 2;
+ rule->layers[1].access = BIT_ULL(1);
+
+ fill_masks(&masks, req, 2);
+ landlock_unmask_layers(rule, &masks);
+
+ /* Layer 1: bit 0 cleared, no_inherit set */
+ KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[0].access, BIT_ULL(1));
+ KUNIT_EXPECT_TRUE(test, masks.layers[0].no_inherit);
+
+ /* Layer 2: bit 1 cleared, no_inherit not set */
+ KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[1].access, BIT_ULL(0));
+ KUNIT_EXPECT_FALSE(test, masks.layers[1].no_inherit);
+ kfree(rule);
+}
+
+/*
+ * Verify that when applying two rules sequentially (as happens during
+ * a path walk), no_inherit from the first rule prevents the second
+ * rule from contributing to that layer.
+ */
+static void test_unmask_no_inherit_sequential(struct kunit *const test)
+{
+ struct landlock_rule *rule1 = alloc_rule(test, 1);
+ struct landlock_rule *rule2 = alloc_rule(test, 1);
+ struct layer_masks masks;
+ const access_mask_t req = BIT_ULL(0) | BIT_ULL(1);
+
+ /* Rule 1: no_inherit on layer 1, grants access bit 0 */
+ rule1->layers[0].level = 1;
+ rule1->layers[0].access = BIT_ULL(0);
+ rule1->layers[0].flags.no_inherit = true;
+
+ /* Rule 2: also on layer 1, grants access bit 1 (ancestor rule) */
+ rule2->layers[0].level = 1;
+ rule2->layers[0].access = BIT_ULL(1);
+
+ /* Apply rule1 first (descendant), then rule2 (ancestor) */
+ fill_masks(&masks, req, 1);
+ landlock_unmask_layers(rule1, &masks);
+ landlock_unmask_layers(rule2, &masks);
+
+ /*
+ * Rule2 should be skipped because rule1 set no_inherit.
+ * bit 0 cleared by rule1, bit 1 remains because rule2 skipped.
+ */
+ KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[0].access, BIT_ULL(1));
+ KUNIT_EXPECT_TRUE(test, masks.layers[0].no_inherit);
+ kfree(rule1);
+ kfree(rule2);
+}
+
+/* clang-format off */
+static struct kunit_case test_cases[] = {
+ KUNIT_CASE(test_unmask_no_inherit_propagates),
+ KUNIT_CASE(test_unmask_no_inherit_skip),
+ KUNIT_CASE(test_unmask_no_inherit_both_set),
+ KUNIT_CASE(test_unmask_multilayer_no_inherit),
+ KUNIT_CASE(test_unmask_no_inherit_sequential),
+ {}
+};
+/* clang-format on */
+
+static struct kunit_suite test_suite = {
+ .name = "landlock_ruleset",
+ .test_cases = test_cases,
+};
+
+kunit_test_suite(test_suite);
+
+#endif /* CONFIG_SECURITY_LANDLOCK_KUNIT_TEST */
--
2.53.0
^ permalink raw reply related
* [PATCH v8 09/10] selftests/landlock: Add selftests for LANDLOCK_ADD_RULE_NO_INHERIT
From: Justin Suess @ 2026-05-29 1:52 UTC (permalink / raw)
To: gnoack3000, mic; +Cc: linux-kernel, linux-security-module, Justin Suess
In-Reply-To: <20260529015210.500291-1-utilityemal77@gmail.com>
Add test coverage for the new flag:
- New layout1_no_inherit fixture with five variants covering NO_INHERIT
on leaf, middle, and root directories, RW-over-RO expansion, and a
regular file target. Three tests per variant exercise inheritance
blocking, topology sealing, and layered (multi-domain) NO_INHERIT.
- A new layout4_disconnected_leafs variant exercising NO_INHERIT applied
through a bind mount, asserting that ancestors in both the bind and
source paths are sealed.
- A new audit_no_inherit fixture verifying that the flag interacts
correctly with the quiet flag: a quiet ancestor does not suppress
audit on a descendant that has crossed a NO_INHERIT boundary.
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---
Notes:
v7..v8 changes:
* Reorganized the new fs_test.c coverage around fixtures and
variants instead of one TEST_F_FORK per scenario:
- New layout1_no_inherit fixture with five FIXTURE_VARIANT_ADD
cases (rw_parent_ro_leaf, rw_parent_ro_middle, rw_parent_ro_root,
ro_parent_rw_middle, rw_parent_read_file) collapse what were
eight near-duplicate layout1 tests in v7 into three shared
tests (blocks_inheritance, seals_topology, layered_no_inherit).
- New layout4_disconnected_leafs variant 'no_inherit_mount' with a
single 'no_inherit_seals_mount' test replaces the four
v7 layout4 tests (no_inherit_mount_parent_{rename,rmdir,link}
and no_inherit_source_parent_rename) by exercising all four
sealed topology operations in one test.
- New audit_no_inherit fixture with three variants
(parent_is_logged, blocks_quiet_inheritance, quiet_parent)
covers the quiet/no_inherit interaction previously inlined
into an ad hoc audit test.
* Net change: 705 added lines in v7 -> 419 added lines in v8, with
equivalent coverage.
tools/testing/selftests/landlock/fs_test.c | 419 +++++++++++++++++++++
1 file changed, 419 insertions(+)
diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index 2e32295258f9..625ff1afecb0 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -1429,6 +1429,224 @@ TEST_F_FORK(layout1, inherit_superset)
ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
}
+FIXTURE(layout1_no_inherit) {};
+
+FIXTURE_SETUP(layout1_no_inherit)
+{
+ prepare_layout(_metadata);
+ create_layout1(_metadata);
+}
+
+FIXTURE_TEARDOWN_PARENT(layout1_no_inherit)
+{
+ remove_layout1(_metadata);
+ cleanup_layout(_metadata);
+}
+
+FIXTURE_VARIANT(layout1_no_inherit)
+{
+ const char *ni_path;
+ const __u64 ni_access;
+ const char *ni_file;
+ const char *desc_file;
+ const int expected_ni_write;
+ const int expected_ni_read;
+ const int expected_desc_write;
+ const int expected_desc_read;
+};
+
+/* NO_INHERIT on leaf directory: blocks parent's RW, grants only RO. */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout1_no_inherit, rw_parent_ro_leaf) {
+ /* clang-format on */
+ .ni_path = TMP_DIR "/s1d1/s1d2/s1d3",
+ .ni_access = ACCESS_RO,
+ .ni_file = TMP_DIR "/s1d1/s1d2/s1d3/f1",
+ .desc_file = TMP_DIR "/s1d1/s1d2/s1d3/f2",
+ .expected_ni_write = EACCES,
+ .expected_ni_read = 0,
+ .expected_desc_write = EACCES,
+ .expected_desc_read = 0,
+};
+
+/* NO_INHERIT on middle directory: blocks parent's RW for all descendants. */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout1_no_inherit, rw_parent_ro_middle) {
+ /* clang-format on */
+ .ni_path = TMP_DIR "/s1d1/s1d2",
+ .ni_access = ACCESS_RO,
+ .ni_file = TMP_DIR "/s1d1/s1d2/f1",
+ .desc_file = TMP_DIR "/s1d1/s1d2/s1d3/f1",
+ .expected_ni_write = EACCES,
+ .expected_ni_read = 0,
+ .expected_desc_write = EACCES,
+ .expected_desc_read = 0,
+};
+
+/* NO_INHERIT on root directory: blocks parent's RW for entire subtree. */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout1_no_inherit, rw_parent_ro_root) {
+ /* clang-format on */
+ .ni_path = TMP_DIR "/s1d1",
+ .ni_access = ACCESS_RO,
+ .ni_file = TMP_DIR "/s1d1/f1",
+ .desc_file = TMP_DIR "/s1d1/s1d2/s1d3/f1",
+ .expected_ni_write = EACCES,
+ .expected_ni_read = 0,
+ .expected_desc_write = EACCES,
+ .expected_desc_read = 0,
+};
+
+/* NO_INHERIT with RW access expands parent's RO to RW. */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout1_no_inherit, ro_parent_rw_middle) {
+ /* clang-format on */
+ .ni_path = TMP_DIR "/s1d1/s1d2",
+ .ni_access = ACCESS_RW,
+ .ni_file = TMP_DIR "/s1d1/s1d2/f1",
+ .desc_file = TMP_DIR "/s1d1/s1d2/s1d3/f1",
+ .expected_ni_write = 0,
+ .expected_ni_read = 0,
+ .expected_desc_write = 0,
+ .expected_desc_read = 0,
+};
+
+/* NO_INHERIT on a file: file gets only its explicit READ_FILE access. */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout1_no_inherit, rw_parent_read_file) {
+ /* clang-format on */
+ .ni_path = TMP_DIR "/s1d1/s1d2/f1",
+ .ni_access = LANDLOCK_ACCESS_FS_READ_FILE,
+ .ni_file = TMP_DIR "/s1d1/s1d2/f1",
+ .desc_file = TMP_DIR "/s1d1/s1d2/f2",
+ .expected_ni_write = EACCES,
+ .expected_ni_read = 0,
+ .expected_desc_write = 0,
+ .expected_desc_read = 0,
+};
+
+TEST_F_FORK(layout1_no_inherit, blocks_inheritance)
+{
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = ACCESS_RW,
+ };
+ int ruleset_fd;
+
+ /* RO variants: TMP_DIR gets RO instead of RW. */
+ if (variant->ni_access == ACCESS_RW)
+ ruleset_attr.handled_access_fs |= LANDLOCK_ACCESS_FS_READ_DIR;
+
+ ruleset_fd =
+ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd);
+
+ if (variant->ni_access == ACCESS_RW)
+ add_path_beneath(_metadata, ruleset_fd, ACCESS_RO, TMP_DIR, 0);
+ else
+ add_path_beneath(_metadata, ruleset_fd, ACCESS_RW, TMP_DIR, 0);
+
+ add_path_beneath(_metadata, ruleset_fd, variant->ni_access,
+ variant->ni_path, LANDLOCK_ADD_RULE_NO_INHERIT);
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ EXPECT_EQ(variant->expected_ni_write,
+ test_open(variant->ni_file, O_WRONLY));
+ EXPECT_EQ(variant->expected_ni_read,
+ test_open(variant->ni_file, O_RDONLY));
+
+ if (variant->desc_file != variant->ni_file) {
+ EXPECT_EQ(variant->expected_desc_write,
+ test_open(variant->desc_file, O_WRONLY));
+ EXPECT_EQ(variant->expected_desc_read,
+ test_open(variant->desc_file, O_RDONLY));
+ }
+}
+
+TEST_F_FORK(layout1_no_inherit, seals_topology)
+{
+ int ruleset_fd;
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = ACCESS_RW | LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_REMOVE_FILE |
+ LANDLOCK_ACCESS_FS_REMOVE_DIR,
+ };
+
+ ruleset_fd =
+ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd);
+
+ add_path_beneath(_metadata, ruleset_fd,
+ ACCESS_RW | LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_REMOVE_FILE |
+ LANDLOCK_ACCESS_FS_REMOVE_DIR,
+ TMP_DIR, 0);
+ add_path_beneath(_metadata, ruleset_fd, variant->ni_access,
+ variant->ni_path, LANDLOCK_ADD_RULE_NO_INHERIT);
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* The directory bearing NO_INHERIT cannot be renamed or removed. */
+ ASSERT_EQ(-1, rename(variant->ni_path, TMP_DIR "/ni_renamed"));
+ ASSERT_EQ(EACCES, errno);
+
+ /*
+ * Content inside the NO_INHERIT directory is still mutable
+ * (if the access rights permit it).
+ */
+ if (variant->ni_access & LANDLOCK_ACCESS_FS_REMOVE_FILE) {
+ ASSERT_EQ(0, unlink(variant->ni_file));
+ } else {
+ ASSERT_EQ(-1, unlink(variant->ni_file));
+ }
+
+ /* Unrelated operations outside the sealed branch still work. */
+ ASSERT_EQ(0, unlink(file1_s2d1));
+ ASSERT_EQ(0, mknod(file1_s2d1, S_IFREG | 0700, 0));
+}
+
+TEST_F_FORK(layout1_no_inherit, layered_no_inherit)
+{
+ const struct rule layer_rules[] = {
+ {
+ .path = TMP_DIR,
+ .access = ACCESS_RW | LANDLOCK_ACCESS_FS_REMOVE_FILE,
+ },
+ {},
+ };
+ int ruleset_fd;
+
+ /* Layer 1: RW on TMP_DIR. */
+ ruleset_fd = create_ruleset(_metadata,
+ ACCESS_RW | LANDLOCK_ACCESS_FS_REMOVE_FILE,
+ layer_rules);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Layer 2: NO_INHERIT on the target. */
+ ruleset_fd = create_ruleset(_metadata,
+ ACCESS_RW | LANDLOCK_ACCESS_FS_REMOVE_FILE,
+ layer_rules);
+ ASSERT_LE(0, ruleset_fd);
+ add_path_beneath(_metadata, ruleset_fd, variant->ni_access,
+ variant->ni_path, LANDLOCK_ADD_RULE_NO_INHERIT);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* The target path cannot be renamed. */
+ ASSERT_EQ(-1, rename(variant->ni_path, TMP_DIR "/ni_renamed_layered"));
+ ASSERT_EQ(EACCES, errno);
+
+ /* Content at NI path respects the NO_INHERIT access from layer 2. */
+ EXPECT_EQ(variant->expected_ni_write,
+ test_open(variant->ni_file, O_WRONLY));
+ EXPECT_EQ(variant->expected_ni_read,
+ test_open(variant->ni_file, O_RDONLY));
+}
+
TEST_F_FORK(layout0, max_layers)
{
int i, err;
@@ -5571,6 +5789,25 @@ FIXTURE_VARIANT(layout4_disconnected_leafs)
const int expected_exchange_result;
/* Expected result of the call to renameat([fd:s1d42]/f4, [fd:s1d42]/f5). */
const int expected_same_dir_rename_result;
+
+ /*
+ * If true, a NO_INHERIT rule is set on s1d41 (via the bind mount
+ * at s2d2). Used by the no_inherit_mount test.
+ */
+ bool no_inherit_on_s1d41;
+ /*
+ * Access rights used for the optional NO_INHERIT rule on s1d41.
+ */
+ const __u64 no_inherit_access;
+ /*
+ * Expected result of renaming s1d31 (parent of s1d41 within the
+ * mount) when no_inherit_on_s1d41 is set.
+ */
+ const int expected_parent_rename;
+ /*
+ * Expected result of rmdir on s1d31, when no_inherit_on_s1d41 is set.
+ */
+ const int expected_parent_rmdir;
};
/* clang-format off */
@@ -5823,6 +6060,26 @@ FIXTURE_VARIANT_ADD(layout4_disconnected_leafs, f1_f2_f3) {
.expected_exchange_result = EACCES,
};
+/*
+ * NO_INHERIT variant: s1d41 is protected with ACCESS_RO via the bind mount.
+ * Parents within the mount are sealed against topology changes.
+ */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected_leafs, no_inherit_mount) {
+ /* clang-format on */
+ .allowed_f1 = LANDLOCK_ACCESS_FS_READ_FILE,
+ .allowed_f2 = LANDLOCK_ACCESS_FS_READ_FILE,
+ .allowed_f3 = LANDLOCK_ACCESS_FS_READ_FILE,
+ .expected_read_result = 0,
+ .expected_rename_result = EACCES,
+ .expected_exchange_result = EACCES,
+ .expected_same_dir_rename_result = EACCES,
+ .no_inherit_on_s1d41 = true,
+ .no_inherit_access = ACCESS_RO,
+ .expected_parent_rename = EACCES,
+ .expected_parent_rmdir = EACCES,
+};
+
TEST_F_FORK(layout4_disconnected_leafs, read_rename_exchange)
{
const __u64 handled_access =
@@ -5931,6 +6188,70 @@ TEST_F_FORK(layout4_disconnected_leafs, read_rename_exchange)
test_renameat(s1d42_bind_fd, "f4", s1d42_bind_fd, "f5"));
}
+/*
+ * When s1d41 (accessed via the bind mount at s2d2) is protected with
+ * NO_INHERIT, its parent directories within the mount are sealed from
+ * topology changes. Other variants do not exercise NO_INHERIT and skip
+ * this test.
+ */
+TEST_F_FORK(layout4_disconnected_leafs, no_inherit_seals_mount)
+{
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = ACCESS_RW | LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_REMOVE_FILE |
+ LANDLOCK_ACCESS_FS_REMOVE_DIR,
+ };
+ int ruleset_fd, s1d41_bind_fd;
+
+ if (!variant->no_inherit_on_s1d41)
+ SKIP(return, "variant does not set NO_INHERIT on s1d41");
+
+ ruleset_fd =
+ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd);
+
+ add_path_beneath(_metadata, ruleset_fd,
+ ACCESS_RW | LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_REMOVE_FILE |
+ LANDLOCK_ACCESS_FS_REMOVE_DIR,
+ TMP_DIR, 0);
+
+ s1d41_bind_fd = open(TMP_DIR "/s2d1/s2d2/s1d31/s1d41",
+ O_DIRECTORY | O_PATH | O_CLOEXEC);
+ ASSERT_LE(0, s1d41_bind_fd);
+
+ ASSERT_EQ(0, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &(struct landlock_path_beneath_attr){
+ .parent_fd = s1d41_bind_fd,
+ .allowed_access =
+ variant->no_inherit_access,
+ },
+ LANDLOCK_ADD_RULE_NO_INHERIT));
+ EXPECT_EQ(0, close(s1d41_bind_fd));
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Parent of s1d41 within the mount is sealed. */
+ ASSERT_EQ(-1, rmdir(TMP_DIR "/s2d1/s2d2/s1d31"));
+ ASSERT_EQ(variant->expected_parent_rmdir, errno);
+
+ ASSERT_EQ(-1, rename(TMP_DIR "/s2d1/s2d2/s1d31",
+ TMP_DIR "/s2d1/s2d2/s1d31_renamed"));
+ ASSERT_EQ(variant->expected_parent_rename, errno);
+
+ /* Sibling directories outside the sealed chain are free. */
+ ASSERT_EQ(0, rename(TMP_DIR "/s2d1/s2d2/s1d32",
+ TMP_DIR "/s2d1/s2d2/s1d32_renamed"));
+ ASSERT_EQ(0, rename(TMP_DIR "/s2d1/s2d2/s1d32_renamed",
+ TMP_DIR "/s2d1/s2d2/s1d32"));
+
+ /* The mount source parent hierarchy is also sealed. */
+ ASSERT_EQ(-1, rename(TMP_DIR "/s1d1/s1d2/s1d31",
+ TMP_DIR "/s1d1/s1d2/s1d31_renamed"));
+ ASSERT_EQ(variant->expected_parent_rename, errno);
+}
+
/*
* layout5_disconnected_branch before rename:
*
@@ -7358,6 +7679,104 @@ TEST_F(audit_layout1, write_file)
EXPECT_EQ(1, records.domain);
}
+FIXTURE(audit_no_inherit)
+{
+ struct audit_filter audit_filter;
+ int audit_fd;
+};
+
+FIXTURE_SETUP(audit_no_inherit)
+{
+ prepare_layout(_metadata);
+ create_layout1(_metadata);
+
+ set_cap(_metadata, CAP_AUDIT_CONTROL);
+ self->audit_fd = audit_init_with_exe_filter(&self->audit_filter);
+ EXPECT_LE(0, self->audit_fd);
+ clear_cap(_metadata, CAP_AUDIT_CONTROL);
+}
+
+FIXTURE_TEARDOWN_PARENT(audit_no_inherit)
+{
+ remove_layout1(_metadata);
+ cleanup_layout(_metadata);
+
+ EXPECT_EQ(0, audit_cleanup(-1, NULL));
+}
+
+FIXTURE_VARIANT(audit_no_inherit)
+{
+ bool parent_quiet;
+ const char *test_path;
+ bool expect_audit_log;
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(audit_no_inherit, parent_is_logged) {
+ /* clang-format on */
+ .parent_quiet = false,
+ .test_path = TMP_DIR "/s1d1/s1d2/f1",
+ .expect_audit_log = true,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(audit_no_inherit, blocks_quiet_inheritance) {
+ /* clang-format on */
+ .parent_quiet = true,
+ .test_path = TMP_DIR "/s1d1/s1d2/s1d3/f1",
+ .expect_audit_log = true,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(audit_no_inherit, quiet_parent) {
+ /* clang-format on */
+ .parent_quiet = true,
+ .test_path = TMP_DIR "/s1d1/f1",
+ .expect_audit_log = false,
+};
+
+TEST_F(audit_no_inherit, no_inherit_audit)
+{
+ struct audit_records records;
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = ACCESS_RW,
+ .quiet_access_fs = variant->parent_quiet ? ACCESS_RW : 0,
+ };
+ int ruleset_fd;
+
+ ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+ sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd);
+
+ if (variant->parent_quiet)
+ add_path_beneath(_metadata, ruleset_fd, ACCESS_RO, dir_s1d1,
+ LANDLOCK_ADD_RULE_QUIET);
+ else
+ add_path_beneath(_metadata, ruleset_fd, ACCESS_RO, dir_s1d1, 0);
+
+ add_path_beneath(_metadata, ruleset_fd, ACCESS_RO, dir_s1d3,
+ LANDLOCK_ADD_RULE_NO_INHERIT);
+
+ enforce_ruleset(_metadata, ruleset_fd);
+
+ EXPECT_EQ(EACCES, test_open(variant->test_path, O_WRONLY));
+ if (variant->expect_audit_log) {
+ EXPECT_EQ(0, matches_log_fs(_metadata, self->audit_fd,
+ "fs\\.write_file",
+ variant->test_path));
+ } else {
+ EXPECT_NE(0, matches_log_fs(_metadata, self->audit_fd,
+ "fs\\.write_file",
+ variant->test_path));
+ }
+
+ EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
+ EXPECT_EQ(0, records.access);
+ EXPECT_EQ(variant->expect_audit_log ? 1 : 0, records.domain);
+
+ EXPECT_EQ(0, close(ruleset_fd));
+}
+
TEST_F(audit_layout1, read_file)
{
struct audit_records records;
--
2.53.0
^ permalink raw reply related
* [PATCH v8 08/10] samples/landlock: Add LANDLOCK_ADD_RULE_NO_INHERIT to landlock-sandboxer
From: Justin Suess @ 2026-05-29 1:52 UTC (permalink / raw)
To: gnoack3000, mic
Cc: linux-kernel, linux-security-module, Justin Suess, Tingmao Wang
In-Reply-To: <20260529015210.500291-1-utilityemal77@gmail.com>
Add a new LL_FS_NO_INHERIT environment variable to the sandboxer.
Paths listed in it are added with the LANDLOCK_ADD_RULE_NO_INHERIT
flag, demonstrating how to set up a parent directory with broader
access than its children.
The flag is silently skipped on kernels older than ABI 10.
Cc: Tingmao Wang <m@maowtm.org>
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---
Notes:
v7..v8 changes:
* Reworded commit message.
* Updated the ABI fallthrough comment to mention 'quiet or
no_inherit flags'. No code change beyond the comment update.
samples/landlock/sandboxer.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
index 74ee53afed6a..b126ffd7cd4f 100644
--- a/samples/landlock/sandboxer.c
+++ b/samples/landlock/sandboxer.c
@@ -60,6 +60,7 @@ static inline int landlock_restrict_self(const int ruleset_fd,
#define ENV_FS_RW_NAME "LL_FS_RW"
#define ENV_FS_QUIET_NAME "LL_FS_QUIET"
#define ENV_FS_QUIET_ACCESS_NAME "LL_FS_QUIET_ACCESS"
+#define ENV_FS_NO_INHERIT_NAME "LL_FS_NO_INHERIT"
#define ENV_TCP_BIND_NAME "LL_TCP_BIND"
#define ENV_TCP_CONNECT_NAME "LL_TCP_CONNECT"
#define ENV_NET_QUIET_NAME "LL_NET_QUIET"
@@ -395,6 +396,7 @@ static const char help[] =
"but to test audit we can set " ENV_FORCE_LOG_NAME "=1\n"
ENV_FS_QUIET_NAME " and " ENV_NET_QUIET_NAME ", both optional, can then be used "
"to make access to some denied paths or network ports not trigger audit logging.\n"
+ ENV_FS_NO_INHERIT_NAME " can be used to suppress access right propagation (ABI >= 10).\n"
ENV_FS_QUIET_ACCESS_NAME " and " ENV_NET_QUIET_ACCESS_NAME " can be used to specify "
"which accesses should be quieted (defaults to all):\n"
"* " ENV_FS_QUIET_ACCESS_NAME ": file system accesses to quiet\n"
@@ -447,6 +449,7 @@ int main(const int argc, char *const argv[], char *const *const envp)
};
bool quiet_supported = true;
+ bool no_inherit_supported = true;
int supported_restrict_flags = LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON;
int set_restrict_flags = 0;
@@ -538,8 +541,9 @@ int main(const int argc, char *const argv[], char *const *const envp)
~(LANDLOCK_ACCESS_NET_BIND_UDP |
LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP);
__attribute__((fallthrough));
- /* Don't add quiet flags for ABI < 10 later on. */
+ /* Don't add quiet or no_inherit flags for ABI < 10 later on. */
quiet_supported = false;
+ no_inherit_supported = false;
/* Must be printed for any ABI < LANDLOCK_ABI_LAST. */
fprintf(stderr,
@@ -644,6 +648,13 @@ int main(const int argc, char *const argv[], char *const *const envp)
goto err_close_ruleset;
}
+ /* Don't require this env to be present. */
+ if (no_inherit_supported && getenv(ENV_FS_NO_INHERIT_NAME)) {
+ if (populate_ruleset_fs(ENV_FS_NO_INHERIT_NAME, ruleset_fd, 0,
+ LANDLOCK_ADD_RULE_NO_INHERIT))
+ goto err_close_ruleset;
+ }
+
if (populate_ruleset_net(ENV_TCP_BIND_NAME, ruleset_fd,
LANDLOCK_ACCESS_NET_BIND_TCP, 0)) {
goto err_close_ruleset;
--
2.53.0
^ permalink raw reply related
* [PATCH v8 07/10] landlock: Add documentation for LANDLOCK_ADD_RULE_NO_INHERIT
From: Justin Suess @ 2026-05-29 1:52 UTC (permalink / raw)
To: gnoack3000, mic; +Cc: linux-kernel, linux-security-module, Justin Suess
In-Reply-To: <20260529015210.500291-1-utilityemal77@gmail.com>
Adds documentation of the flag to the userspace api, describing
the functionality of the flag and parent directory protections.
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---
Notes:
v7..v8 changes:
* Minor wording polish in the new 'Filesystem inheritance
suppression' documentation section; no semantic change.
Documentation/userspace-api/landlock.rst | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
index 138d504cb498..ae3136461b18 100644
--- a/Documentation/userspace-api/landlock.rst
+++ b/Documentation/userspace-api/landlock.rst
@@ -733,6 +733,24 @@ struct landlock_ruleset_attr. It is also now possible to suppress audit logs
for scope accesses via the ``quiet_scoped`` field of struct
landlock_ruleset_attr.
+Filesystem inheritance suppression (ABI < 10)
+---------------------------------------------
+
+Starting with the Landlock ABI version 10, it is possible to prevent a
+directory or file from inheriting its parent's access grants by using the
+``LANDLOCK_ADD_RULE_NO_INHERIT`` flag passed to sys_landlock_add_rule().
+This is useful for policies where a parent directory needs broader access
+than its children.
+
+To mitigate sandbox-restart attacks, the tagged inode and all of its
+ancestors up to the VFS root cannot be removed, renamed, reparented, or
+linked into or out of other directories.
+
+Inheritance of access grants from descendants of an inode tagged with
+``LANDLOCK_ADD_RULE_NO_INHERIT`` is unaffected: such descendants continue
+to inherit from the tagged inode normally, unless they also carry this
+flag.
+
.. _kernel_support:
Kernel support
--
2.53.0
^ permalink raw reply related
* [PATCH v8 06/10] landlock: Implement LANDLOCK_ADD_RULE_NO_INHERIT
From: Justin Suess @ 2026-05-29 1:52 UTC (permalink / raw)
To: gnoack3000, mic; +Cc: linux-kernel, linux-security-module, Justin Suess
In-Reply-To: <20260529015210.500291-1-utilityemal77@gmail.com>
Make %LANDLOCK_ADD_RULE_NO_INHERIT actually enforce its semantics:
- Tag the new rule's layer with @no_inherit and
@has_no_inherit_descendant so landlock_unmask_layers() stops walking
up the hierarchy once it has been seen, and so the rule's own object
is sealed against topology changes.
- Walk from the rule's path up to the VFS root in
landlock_append_fs_rule(), inserting a zero-access rule on each
ancestor with @has_no_inherit_descendant set, so topology changes
(rename, rmdir, link, ...) on any ancestor are denied too.
- Add deny_no_inherit_topology_change(), called from
current_check_refer_path(), hook_path_unlink() and hook_path_rmdir()
to enforce the seal at the LSM hook layer.
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---
Notes:
v7..v8 changes:
* Reworded commit message to describe the three pieces of the
implementation (per-layer tagging, ancestor walk, LSM hook
enforcement).
* Reworked ancestor sealing in landlock_append_fs_rule(): use the
new landlock_insert_rule() return value to tag flags directly on
the inserted ancestor rule, removing the prior find_rule() round
trip after insertion.
* Moved no_inherit / has_no_inherit_descendant tracking from a
separate collected_rule_flags struct into the existing
layer_mask struct as a single per-layer 'no_inherit' bit.
landlock_unmask_layers() now skips layers whose mask already has
no_inherit set, and landlock_init_layer_masks() clears the new
bit. The 'has_no_inherit_descendant' rule-layer flag is
auto-set on the rule's own object when LANDLOCK_ADD_RULE_NO_INHERIT
is passed, sealing it against topology changes without a separate
blank-rule insertion.
* Simplified deny_no_inherit_topology_change(): dropped the
override_layers accumulator (it was always 0 in practice) and
now just OR-collects sealed layers from no_inherit /
has_no_inherit_descendant.
* Updated kerneldoc comments on the new layer flags.
security/landlock/access.h | 4 ++
security/landlock/fs.c | 116 +++++++++++++++++++++++++++++++++++-
security/landlock/ruleset.c | 30 +++++++++-
security/landlock/ruleset.h | 13 ++++
4 files changed, 159 insertions(+), 4 deletions(-)
diff --git a/security/landlock/access.h b/security/landlock/access.h
index 61a17b568652..ab5c1e0bc25d 100644
--- a/security/landlock/access.h
+++ b/security/landlock/access.h
@@ -71,12 +71,16 @@ static_assert(sizeof(typeof_member(union access_masks_all, masks)) ==
*
* @quiet is used to store whether we have encountered a rule with the
* quiet flag for this layer, which will be used to control audit logging.
+ *
+ * @no_inherit is used to mark this layer as having a no_inherit rule, so
+ * that ancestor rules in the same layer do not contribute access rights.
*/
struct layer_mask {
access_mask_t access:LANDLOCK_NUM_ACCESS_MAX;
#ifdef CONFIG_AUDIT
bool quiet:1;
#endif /* CONFIG_AUDIT */
+ bool no_inherit:1;
};
/*
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index ee7d9f5d7ee5..3aa7d898efe1 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -364,6 +364,7 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
struct landlock_id id = {
.type = LANDLOCK_KEY_INODE,
};
+ struct path walker = *path;
/* Files only get access rights that make sense. */
if (!d_is_dir(path->dentry) &&
@@ -378,10 +379,47 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
id.key.object = get_inode_object(d_backing_inode(path->dentry));
if (IS_ERR(id.key.object))
return PTR_ERR(id.key.object);
+
mutex_lock(&ruleset->lock);
rule = landlock_insert_rule(ruleset, id, access_rights, flags);
- if (IS_ERR(rule))
+ if (IS_ERR(rule)) {
err = PTR_ERR(rule);
+ goto out_unlock;
+ }
+ if (!(flags & LANDLOCK_ADD_RULE_NO_INHERIT))
+ goto out_unlock;
+
+ /*
+ * Seal each ancestor up to the VFS root with a no-access rule
+ * tagged @has_no_inherit_descendant so that topology-changing
+ * operations (rename, rmdir, link, ...) on them are denied.
+ */
+ path_get(&walker);
+ while (landlock_walk_path_up(&walker) != LANDLOCK_WALK_STOP_REAL_ROOT) {
+ struct landlock_rule *ancestor_rule;
+ struct inode *const ancestor_inode =
+ d_backing_inode(walker.dentry);
+ struct landlock_id ancestor_id = {
+ .type = LANDLOCK_KEY_INODE,
+ .key.object = get_inode_object(ancestor_inode),
+ };
+
+ if (IS_ERR(ancestor_id.key.object)) {
+ err = PTR_ERR(ancestor_id.key.object);
+ break;
+ }
+ ancestor_rule = landlock_insert_rule(ruleset, ancestor_id, 0,
+ 0);
+ landlock_put_object(ancestor_id.key.object);
+ if (IS_ERR(ancestor_rule)) {
+ err = PTR_ERR(ancestor_rule);
+ break;
+ }
+ ancestor_rule->layers[0].flags.has_no_inherit_descendant = true;
+ }
+ path_put(&walker);
+
+out_unlock:
mutex_unlock(&ruleset->lock);
/*
* No need to check for an error because landlock_insert_rule()
@@ -1106,6 +1144,54 @@ collect_domain_accesses(const struct landlock_ruleset *const domain,
return ret;
}
+/**
+ * deny_no_inherit_topology_change - Deny topology changes on sealed paths
+ * @subject: Subject performing the operation.
+ * @dentry: Target of the topology modification.
+ *
+ * Returns -EACCES (and emits an audit record) if any of the subject's
+ * domain layers seal @dentry against topology changes: either @dentry
+ * itself has a %LANDLOCK_ADD_RULE_NO_INHERIT rule, or one of its
+ * descendants does (recorded via @has_no_inherit_descendant on the
+ * dentry's rule).
+ *
+ * Returns 0 otherwise.
+ */
+static int
+deny_no_inherit_topology_change(const struct landlock_cred_security *subject,
+ struct dentry *const dentry)
+{
+ unsigned long sealed_layers = 0;
+ const struct landlock_rule *rule;
+
+ if (WARN_ON_ONCE(!subject || !dentry || d_is_negative(dentry)))
+ return 0;
+
+ rule = find_rule(subject->domain, dentry);
+ if (!rule)
+ return 0;
+
+ for (size_t i = 0; i < rule->num_layers; i++) {
+ const struct landlock_layer *const layer = &rule->layers[i];
+
+ if (layer->flags.no_inherit ||
+ layer->flags.has_no_inherit_descendant)
+ sealed_layers |= BIT(layer->level - 1);
+ }
+ if (!sealed_layers)
+ return 0;
+
+ landlock_log_denial(subject, &(struct landlock_request) {
+ .type = LANDLOCK_REQUEST_FS_CHANGE_TOPOLOGY,
+ .audit = {
+ .type = LSM_AUDIT_DATA_DENTRY,
+ .u.dentry = dentry,
+ },
+ .layer_plus_one = __ffs(sealed_layers) + 1,
+ });
+ return -EACCES;
+}
+
/**
* current_check_refer_path - Check if a rename or link action is allowed
*
@@ -1188,6 +1274,16 @@ static int current_check_refer_path(struct dentry *const old_dentry,
access_request_parent2 =
get_mode_access(d_backing_inode(old_dentry)->i_mode);
if (removable) {
+ int err = deny_no_inherit_topology_change(subject, old_dentry);
+
+ if (err)
+ return err;
+ if (exchange) {
+ err = deny_no_inherit_topology_change(subject,
+ new_dentry);
+ if (err)
+ return err;
+ }
access_request_parent1 |= maybe_remove(old_dentry);
access_request_parent2 |= maybe_remove(new_dentry);
}
@@ -1579,12 +1675,30 @@ static int hook_path_symlink(const struct path *const dir,
static int hook_path_unlink(const struct path *const dir,
struct dentry *const dentry)
{
+ const struct landlock_cred_security *const subject =
+ landlock_get_applicable_subject(current_cred(), any_fs, NULL);
+
+ if (subject) {
+ int err = deny_no_inherit_topology_change(subject, dentry);
+
+ if (err)
+ return err;
+ }
return current_check_access_path(dir, LANDLOCK_ACCESS_FS_REMOVE_FILE);
}
static int hook_path_rmdir(const struct path *const dir,
struct dentry *const dentry)
{
+ const struct landlock_cred_security *const subject =
+ landlock_get_applicable_subject(current_cred(), any_fs, NULL);
+
+ if (subject) {
+ int err = deny_no_inherit_topology_change(subject, dentry);
+
+ if (err)
+ return err;
+ }
return current_check_access_path(dir, LANDLOCK_ACCESS_FS_REMOVE_DIR);
}
diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index 48397ab43a2d..c78e2b2d73ff 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -258,6 +258,10 @@ insert_rule(struct landlock_ruleset *const ruleset,
return ERR_PTR(-EINVAL);
this->layers[0].access |= (*layers)[0].access;
this->layers[0].flags.quiet |= (*layers)[0].flags.quiet;
+ this->layers[0].flags.no_inherit |=
+ (*layers)[0].flags.no_inherit;
+ this->layers[0].flags.has_no_inherit_descendant |=
+ (*layers)[0].flags.has_no_inherit_descendant;
return this;
}
@@ -311,12 +315,20 @@ landlock_insert_rule(struct landlock_ruleset *const ruleset,
const struct landlock_id id,
const access_mask_t access, const int flags)
{
+ const bool no_inherit = !!(flags & LANDLOCK_ADD_RULE_NO_INHERIT);
struct landlock_layer layers[] = { {
.access = access,
/* When @level is zero, insert_rule() extends @ruleset. */
.level = 0,
.flags = {
.quiet = !!(flags & LANDLOCK_ADD_RULE_QUIET),
+ .no_inherit = no_inherit,
+ /*
+ * The rule's own object is also sealed against
+ * topology changes, so mark it as if it had a
+ * no-inherit descendant.
+ */
+ .has_no_inherit_descendant = no_inherit,
},
} };
@@ -657,15 +669,25 @@ bool landlock_unmask_layers(const struct landlock_rule *const rule,
*/
for (size_t i = 0; i < rule->num_layers; i++) {
const struct landlock_layer *const layer = &rule->layers[i];
+ struct layer_mask *const layer_mask =
+ &masks->layers[layer->level - 1];
+
+ /*
+ * Skip layers that already have no_inherit set: these layers
+ * should not inherit access rights from ancestor directories.
+ */
+ if (layer_mask->no_inherit)
+ continue;
/* Clear the bits where the layer in the rule grants access. */
- masks->layers[layer->level - 1].access &= ~layer->access;
+ layer_mask->access &= ~layer->access;
#ifdef CONFIG_AUDIT
- /* Collect rule flags for each layer. */
if (layer->flags.quiet)
- masks->layers[layer->level - 1].quiet = true;
+ layer_mask->quiet = true;
#endif /* CONFIG_AUDIT */
+ if (layer->flags.no_inherit)
+ layer_mask->no_inherit = true;
}
for (size_t i = 0; i < ARRAY_SIZE(masks->layers); i++) {
@@ -731,6 +753,7 @@ landlock_init_layer_masks(const struct landlock_ruleset *const domain,
#ifdef CONFIG_AUDIT
masks->layers[i].quiet = false;
#endif /* CONFIG_AUDIT */
+ masks->layers[i].no_inherit = false;
}
for (size_t i = domain->num_layers; i < ARRAY_SIZE(masks->layers);
i++) {
@@ -738,6 +761,7 @@ landlock_init_layer_masks(const struct landlock_ruleset *const domain,
#ifdef CONFIG_AUDIT
masks->layers[i].quiet = false;
#endif /* CONFIG_AUDIT */
+ masks->layers[i].no_inherit = false;
}
return handled_accesses;
diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
index 5b7f554e8442..249a736248db 100644
--- a/security/landlock/ruleset.h
+++ b/security/landlock/ruleset.h
@@ -40,6 +40,19 @@ struct landlock_layer {
* down the file hierarchy.
*/
bool quiet:1;
+ /**
+ * @no_inherit: Prevents this rule from inheriting access rights
+ * from ancestor inodes. Only used for filesystem rules; set
+ * via %LANDLOCK_ADD_RULE_NO_INHERIT.
+ */
+ bool no_inherit:1;
+ /**
+ * @has_no_inherit_descendant: Marker used to deny topology
+ * changes on the rule's object: either the object itself has
+ * a no-inherit rule, or a descendant does. Only used for
+ * filesystem rules; set by Landlock, never by user space.
+ */
+ bool has_no_inherit_descendant:1;
} flags;
/**
* @access: Bitfield of allowed actions on the kernel object. They are
--
2.53.0
^ permalink raw reply related
* [PATCH v8 05/10] landlock: Return inserted rule from landlock_insert_rule()
From: Justin Suess @ 2026-05-29 1:52 UTC (permalink / raw)
To: gnoack3000, mic; +Cc: linux-kernel, linux-security-module, Justin Suess
In-Reply-To: <20260529015210.500291-1-utilityemal77@gmail.com>
Change insert_rule() and landlock_insert_rule() to return the inserted
(or updated) struct landlock_rule pointer instead of an int errno.
Errors are propagated via ERR_PTR().
This gives callers a handle on the resulting rule so a subsequent change
can mutate per-layer flags on it (e.g. to mark ancestor rules created
for no-inherit topology sealing).
No functional change intended.
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---
Notes:
v7..v8 changes:
* Replaced the v7 "Move find_rule definition above
landlock_append_fs_rule" patch with this new preparatory patch.
Instead of moving find_rule(), make landlock_insert_rule() (and
its static insert_rule() helper) return the inserted struct
landlock_rule * via ERR_PTR(), so callers can directly tag flags
on the resulting rule. Callers in net.c, merge_tree(), and
inherit_tree() updated accordingly. No functional change.
security/landlock/fs.c | 7 ++--
security/landlock/net.c | 8 +++--
security/landlock/ruleset.c | 68 ++++++++++++++++++-------------------
security/landlock/ruleset.h | 7 ++--
4 files changed, 48 insertions(+), 42 deletions(-)
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 6552351e0b9c..ee7d9f5d7ee5 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -359,7 +359,8 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
const struct path *const path,
access_mask_t access_rights, const int flags)
{
- int err;
+ int err = 0;
+ struct landlock_rule *rule;
struct landlock_id id = {
.type = LANDLOCK_KEY_INODE,
};
@@ -378,7 +379,9 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
if (IS_ERR(id.key.object))
return PTR_ERR(id.key.object);
mutex_lock(&ruleset->lock);
- err = landlock_insert_rule(ruleset, id, access_rights, flags);
+ rule = landlock_insert_rule(ruleset, id, access_rights, flags);
+ if (IS_ERR(rule))
+ err = PTR_ERR(rule);
mutex_unlock(&ruleset->lock);
/*
* No need to check for an error because landlock_insert_rule()
diff --git a/security/landlock/net.c b/security/landlock/net.c
index 60894cff973e..f08be4be275a 100644
--- a/security/landlock/net.c
+++ b/security/landlock/net.c
@@ -23,11 +23,11 @@ int landlock_append_net_rule(struct landlock_ruleset *const ruleset,
const u16 port, access_mask_t access_rights,
const int flags)
{
- int err;
const struct landlock_id id = {
.key.data = (__force uintptr_t)htons(port),
.type = LANDLOCK_KEY_NET_PORT,
};
+ struct landlock_rule *rule;
BUILD_BUG_ON(sizeof(port) > sizeof(id.key.data));
@@ -36,10 +36,12 @@ int landlock_append_net_rule(struct landlock_ruleset *const ruleset,
~landlock_get_net_access_mask(ruleset, 0);
mutex_lock(&ruleset->lock);
- err = landlock_insert_rule(ruleset, id, access_rights, flags);
+ rule = landlock_insert_rule(ruleset, id, access_rights, flags);
mutex_unlock(&ruleset->lock);
- return err;
+ if (IS_ERR(rule))
+ return PTR_ERR(rule);
+ return 0;
}
static int current_check_access_socket(struct socket *const sock,
diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index f01c3e14e55d..48397ab43a2d 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -203,12 +203,13 @@ static void build_check_ruleset(void)
* added to @ruleset as new constraints, similarly to a boolean AND between
* access rights.
*
- * Return: 0 on success, -errno on failure.
+ * Return: A pointer to the inserted or updated rule, or an ERR_PTR on failure.
*/
-static int insert_rule(struct landlock_ruleset *const ruleset,
- const struct landlock_id id,
- const struct landlock_layer (*layers)[],
- const size_t num_layers)
+static struct landlock_rule *
+insert_rule(struct landlock_ruleset *const ruleset,
+ const struct landlock_id id,
+ const struct landlock_layer (*layers)[],
+ const size_t num_layers)
{
struct rb_node **walker_node;
struct rb_node *parent_node = NULL;
@@ -218,14 +219,14 @@ static int insert_rule(struct landlock_ruleset *const ruleset,
might_sleep();
lockdep_assert_held(&ruleset->lock);
if (WARN_ON_ONCE(!layers))
- return -ENOENT;
+ return ERR_PTR(-ENOENT);
if (is_object_pointer(id.type) && WARN_ON_ONCE(!id.key.object))
- return -ENOENT;
+ return ERR_PTR(-ENOENT);
root = get_root(ruleset, id.type);
if (IS_ERR(root))
- return PTR_ERR(root);
+ return ERR_CAST(root);
walker_node = &root->rb_node;
while (*walker_node) {
@@ -243,7 +244,7 @@ static int insert_rule(struct landlock_ruleset *const ruleset,
/* Only a single-level layer should match an existing rule. */
if (WARN_ON_ONCE(num_layers != 1))
- return -EINVAL;
+ return ERR_PTR(-EINVAL);
/* If there is a matching rule, updates it. */
if ((*layers)[0].level == 0) {
@@ -252,16 +253,16 @@ static int insert_rule(struct landlock_ruleset *const ruleset,
* landlock_add_rule(2), i.e. @ruleset is not a domain.
*/
if (WARN_ON_ONCE(this->num_layers != 1))
- return -EINVAL;
+ return ERR_PTR(-EINVAL);
if (WARN_ON_ONCE(this->layers[0].level != 0))
- return -EINVAL;
+ return ERR_PTR(-EINVAL);
this->layers[0].access |= (*layers)[0].access;
this->layers[0].flags.quiet |= (*layers)[0].flags.quiet;
- return 0;
+ return this;
}
if (WARN_ON_ONCE(this->layers[0].level == 0))
- return -EINVAL;
+ return ERR_PTR(-EINVAL);
/*
* Intersects access rights when it is a merge between a
@@ -270,23 +271,23 @@ static int insert_rule(struct landlock_ruleset *const ruleset,
new_rule = create_rule(id, &this->layers, this->num_layers,
&(*layers)[0]);
if (IS_ERR(new_rule))
- return PTR_ERR(new_rule);
+ return ERR_CAST(new_rule);
rb_replace_node(&this->node, &new_rule->node, root);
free_rule(this, id.type);
- return 0;
+ return new_rule;
}
/* There is no match for @id. */
build_check_ruleset();
if (ruleset->num_rules >= LANDLOCK_MAX_NUM_RULES)
- return -E2BIG;
+ return ERR_PTR(-E2BIG);
new_rule = create_rule(id, layers, num_layers, NULL);
if (IS_ERR(new_rule))
- return PTR_ERR(new_rule);
+ return ERR_CAST(new_rule);
rb_link_node(&new_rule->node, parent_node, walker_node);
rb_insert_color(&new_rule->node, root);
ruleset->num_rules++;
- return 0;
+ return new_rule;
}
static void build_check_layer(void)
@@ -305,9 +306,10 @@ static void build_check_layer(void)
}
/* @ruleset must be locked by the caller. */
-int landlock_insert_rule(struct landlock_ruleset *const ruleset,
- const struct landlock_id id,
- const access_mask_t access, const int flags)
+struct landlock_rule *
+landlock_insert_rule(struct landlock_ruleset *const ruleset,
+ const struct landlock_id id,
+ const access_mask_t access, const int flags)
{
struct landlock_layer layers[] = { {
.access = access,
@@ -326,9 +328,8 @@ static int merge_tree(struct landlock_ruleset *const dst,
struct landlock_ruleset *const src,
const enum landlock_key_type key_type)
{
- struct landlock_rule *walker_rule, *next_rule;
+ struct landlock_rule *walker_rule, *next_rule, *rule;
struct rb_root *src_root;
- int err = 0;
might_sleep();
lockdep_assert_held(&dst->lock);
@@ -358,11 +359,11 @@ static int merge_tree(struct landlock_ruleset *const dst,
layers[0].access = walker_rule->layers[0].access;
layers[0].flags = walker_rule->layers[0].flags;
- err = insert_rule(dst, id, &layers, ARRAY_SIZE(layers));
- if (err)
- return err;
+ rule = insert_rule(dst, id, &layers, ARRAY_SIZE(layers));
+ if (IS_ERR(rule))
+ return PTR_ERR(rule);
}
- return err;
+ return 0;
}
static int merge_ruleset(struct landlock_ruleset *const dst,
@@ -412,9 +413,8 @@ static int inherit_tree(struct landlock_ruleset *const parent,
struct landlock_ruleset *const child,
const enum landlock_key_type key_type)
{
- struct landlock_rule *walker_rule, *next_rule;
+ struct landlock_rule *walker_rule, *next_rule, *rule;
struct rb_root *parent_root;
- int err = 0;
might_sleep();
lockdep_assert_held(&parent->lock);
@@ -432,12 +432,12 @@ static int inherit_tree(struct landlock_ruleset *const parent,
.type = key_type,
};
- err = insert_rule(child, id, &walker_rule->layers,
- walker_rule->num_layers);
- if (err)
- return err;
+ rule = insert_rule(child, id, &walker_rule->layers,
+ walker_rule->num_layers);
+ if (IS_ERR(rule))
+ return PTR_ERR(rule);
}
- return err;
+ return 0;
}
static int inherit_ruleset(struct landlock_ruleset *const parent,
diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
index ff163e5db5f0..5b7f554e8442 100644
--- a/security/landlock/ruleset.h
+++ b/security/landlock/ruleset.h
@@ -217,9 +217,10 @@ void landlock_put_ruleset_deferred(struct landlock_ruleset *const ruleset);
DEFINE_FREE(landlock_put_ruleset, struct landlock_ruleset *,
if (!IS_ERR_OR_NULL(_T)) landlock_put_ruleset(_T))
-int landlock_insert_rule(struct landlock_ruleset *const ruleset,
- const struct landlock_id id,
- const access_mask_t access, const int flags);
+struct landlock_rule *
+landlock_insert_rule(struct landlock_ruleset *const ruleset,
+ const struct landlock_id id,
+ const access_mask_t access, const int flags);
struct landlock_ruleset *
landlock_merge_ruleset(struct landlock_ruleset *const parent,
--
2.53.0
^ permalink raw reply related
* [PATCH v8 04/10] landlock: Add LANDLOCK_ADD_RULE_NO_INHERIT user API
From: Justin Suess @ 2026-05-29 1:52 UTC (permalink / raw)
To: gnoack3000, mic; +Cc: linux-kernel, linux-security-module, Justin Suess
In-Reply-To: <20260529015210.500291-1-utilityemal77@gmail.com>
Wire up the new LANDLOCK_ADD_RULE_NO_INHERIT flag for
sys_landlock_add_rule(). Define the constant in the UAPI header with
its documentation, accept it from user space for
%LANDLOCK_RULE_PATH_BENEATH only, and update the path-beneath useless-
rule check so that an empty allowed_access is still accepted when a
flag (quiet or no-inherit) is present.
The flag has no enforcement effect yet; that is added in a subsequent
patch.
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---
Notes:
v7..v8 changes:
* Renamed patch from "Implement LANDLOCK_ADD_RULE_NO_INHERIT
userspace api" to "Add LANDLOCK_ADD_RULE_NO_INHERIT user API".
* Reworded the UAPI documentation for LANDLOCK_ADD_RULE_NO_INHERIT
in include/uapi/linux/landlock.h for clarity.
* Centralized flag validation in sys_landlock_add_rule(): rejects
unknown flags with a single ~(QUIET | NO_INHERIT) mask, and
rejects NO_INHERIT on non-path-beneath rule types from the
syscall entry point.
* Removed the now-redundant LANDLOCK_ADD_RULE_NO_INHERIT check in
add_rule_net_port().
* Documented the new EINVAL case for NO_INHERIT on unsupported rule
types in the syscall kernel-doc.
include/uapi/linux/landlock.h | 24 ++++++++++++++++++++++++
security/landlock/syscalls.c | 14 +++++++++++---
2 files changed, 35 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
index 90a0752b61bf..d6de209ab961 100644
--- a/include/uapi/linux/landlock.h
+++ b/include/uapi/linux/landlock.h
@@ -124,10 +124,34 @@ struct landlock_ruleset_attr {
* allowed_access in the passed in rule_attr. When this flag is
* present, the caller is also allowed to pass in an empty
* allowed_access.
+ * %LANDLOCK_ADD_RULE_NO_INHERIT
+ * Disable the inheritance of access rights and flags from parent objects
+ * for the rule's object and its descendants.
+ *
+ * This flag currently applies only to filesystem rules. Passing it with
+ * any other rule type returns ``-EINVAL``.
+ *
+ * By default, Landlock filesystem rules inherit allowed accesses from
+ * ancestor directories: rights granted on a parent directory also apply
+ * to its children. A rule marked with %LANDLOCK_ADD_RULE_NO_INHERIT
+ * stops this propagation at its object; only the accesses explicitly
+ * allowed by the rule apply. Descendants of that object continue to
+ * inherit from it normally, unless they too carry this flag.
+ *
+ * This flag also enforces parent-directory restrictions: rename, rmdir,
+ * link, and other operations that would change the immediate parent of
+ * the rule's object or any of its ancestors are denied up to the VFS
+ * root. This prevents sandboxed processes from manipulating the
+ * filesystem hierarchy to evade restrictions (e.g. via sandbox-restart
+ * attacks).
+ *
+ * Inheritance of rule flags (such as %LANDLOCK_ADD_RULE_QUIET) from
+ * ancestor directories is also blocked at the rule's object.
*/
/* clang-format off */
#define LANDLOCK_ADD_RULE_QUIET (1U << 0)
+#define LANDLOCK_ADD_RULE_NO_INHERIT (1U << 1)
/* clang-format on */
/**
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index 08b6045d6926..04dacfdfc9f3 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -361,7 +361,7 @@ static int add_rule_path_beneath(struct landlock_ruleset *const ruleset,
/*
* Informs about useless rule: empty allowed_access (i.e. deny rules)
* are ignored in path walks. However, the rule is not useless if it
- * is there to hold a quiet flag.
+ * carries a flag (quiet or no-inherit).
*/
if (!flags && !path_beneath_attr.allowed_access)
return -ENOMSG;
@@ -433,7 +433,7 @@ static int add_rule_net_port(struct landlock_ruleset *ruleset,
* @rule_type: Identify the structure type pointed to by @rule_attr:
* %LANDLOCK_RULE_PATH_BENEATH or %LANDLOCK_RULE_NET_PORT.
* @rule_attr: Pointer to a rule (matching the @rule_type).
- * @flags: Must be 0 or %LANDLOCK_ADD_RULE_QUIET.
+ * @flags: Bitmask of %LANDLOCK_ADD_RULE_* flags.
*
* This system call enables to define a new rule and add it to an existing
* ruleset.
@@ -451,6 +451,8 @@ static int add_rule_net_port(struct landlock_ruleset *ruleset,
* - %EINVAL: &landlock_net_port_attr.port is greater than 65535;
* - %EINVAL: LANDLOCK_ADD_RULE_QUIET is passed but the ruleset has no
* quiet access bits set for the corresponding rule type.
+ * - %EINVAL: LANDLOCK_ADD_RULE_NO_INHERIT is passed for a rule type
+ * that does not support it (e.g. %LANDLOCK_RULE_NET_PORT).
* - %ENOMSG: Empty accesses (e.g. &landlock_path_beneath_attr.allowed_access is
* 0) and no flags;
* - %EBADF: @ruleset_fd is not a file descriptor for the current thread, or a
@@ -472,7 +474,13 @@ SYSCALL_DEFINE4(landlock_add_rule, const int, ruleset_fd,
if (!is_initialized())
return -EOPNOTSUPP;
- if (flags && flags != LANDLOCK_ADD_RULE_QUIET)
+ /* Rejects unknown flags. */
+ if (flags & ~(LANDLOCK_ADD_RULE_QUIET | LANDLOCK_ADD_RULE_NO_INHERIT))
+ return -EINVAL;
+
+ /* LANDLOCK_ADD_RULE_NO_INHERIT only applies to path-beneath rules. */
+ if ((flags & LANDLOCK_ADD_RULE_NO_INHERIT) &&
+ rule_type != LANDLOCK_RULE_PATH_BENEATH)
return -EINVAL;
/* Gets and checks the ruleset. */
--
2.53.0
^ permalink raw reply related
* [PATCH v8 03/10] landlock: Use landlock_walk_path_up() in collect_domain_accesses()
From: Justin Suess @ 2026-05-29 1:52 UTC (permalink / raw)
To: gnoack3000, mic; +Cc: linux-kernel, linux-security-module, Justin Suess
In-Reply-To: <20260529015210.500291-1-utilityemal77@gmail.com>
Replace the open-coded loop with landlock_walk_path_up() and change the
function signature from (mnt_root, dir) to a single struct path. The
caller's mount point and starting dentry are now both carried in @path,
which keeps the traversal logic consistent with
is_access_to_paths_allowed().
No functional change intended.
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---
Notes:
v7..v8 changes:
* Reworded commit message.
* Changed collect_domain_accesses() to take a single struct path *
instead of separate mnt_root/dir parameters, simplifying the
interface and matching is_access_to_paths_allowed().
* Tightened the disconnected-directory stop condition to require
!d_unhashed(walker_path.dentry) when comparing against the mount
root, so disconnected bind-mount roots are not mistaken for the
real mount root.
security/landlock/fs.c | 82 ++++++++++++++++++++++--------------------
1 file changed, 44 insertions(+), 38 deletions(-)
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 8fb0aa59e180..6552351e0b9c 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -1032,48 +1032,51 @@ static access_mask_t maybe_remove(const struct dentry *const dentry)
* collect_domain_accesses - Walk through a file path and collect accesses
*
* @domain: Domain to check against.
- * @mnt_root: Last directory to check.
- * @dir: Directory to start the walk from.
+ * @path: Path to start the walk from and whose mount root is the last
+ * directory to check.
* @layer_masks_dom: Where to store the collected accesses.
*
- * This helper is useful to begin a path walk from the @dir directory to a
- * @mnt_root directory used as a mount point. This mount point is the common
- * ancestor between the source and the destination of a renamed and linked
- * file. While walking from @dir to @mnt_root, we record all the domain's
- * allowed accesses in @layer_masks_dom.
+ * This helper is useful to begin a path walk from @path to the mount root
+ * directory used as a mount point. This mount point is the common ancestor
+ * between the source and the destination of a renamed and linked file. While
+ * walking from @path to that mount root, we record all the domain's allowed
+ * accesses in @layer_masks_dom.
*
- * Because of disconnected directories, this walk may not reach @mnt_dir. In
- * this case, the walk will continue to @mnt_dir after this call.
+ * Because of disconnected directories, this walk may not reach that mount
+ * root. In this case, the walk will continue to the mount root after this
+ * call.
*
* This is similar to is_access_to_paths_allowed() but much simpler because it
* only handles walking on the same mount point and only checks one set of
* accesses.
*
- * Return: True if all the domain access rights are allowed for @dir, false if
- * the walk reached @mnt_root.
+ * Return: True if all the domain access rights are allowed for @path, false if
+ * the walk reached the mount root.
*/
-static bool collect_domain_accesses(const struct landlock_ruleset *const domain,
- const struct dentry *const mnt_root,
- struct dentry *dir,
- struct layer_masks *layer_masks_dom)
+static bool
+collect_domain_accesses(const struct landlock_ruleset *const domain,
+ const struct path *const path,
+ struct layer_masks *layer_masks_dom)
{
bool ret = false;
+ struct path walker_path;
- if (WARN_ON_ONCE(!domain || !mnt_root || !dir || !layer_masks_dom))
+ if (WARN_ON_ONCE(!domain || !path || !path->dentry || !path->mnt ||
+ !layer_masks_dom))
return true;
- if (is_nouser_or_private(dir))
+ if (is_nouser_or_private(path->dentry))
return true;
if (!landlock_init_layer_masks(domain, LANDLOCK_MASK_ACCESS_FS,
layer_masks_dom, LANDLOCK_KEY_INODE))
return true;
- dget(dir);
+ walker_path = *path;
+ path_get(&walker_path);
while (true) {
- struct dentry *parent_dentry;
-
/* Gets all layers allowing all domain accesses. */
- if (landlock_unmask_layers(find_rule(domain, dir),
+ if (landlock_unmask_layers(find_rule(domain,
+ walker_path.dentry),
layer_masks_dom)) {
/*
* Stops when all handled accesses are allowed by at
@@ -1084,17 +1087,19 @@ static bool collect_domain_accesses(const struct landlock_ruleset *const domain,
}
/*
- * Stops at the mount point or the filesystem root for a disconnected
- * directory.
+ * Stops at the mount point or the filesystem root for a
+ * disconnected directory.
*/
- if (dir == mnt_root || unlikely(IS_ROOT(dir)))
+ if ((walker_path.dentry == path->mnt->mnt_root &&
+ walker_path.mnt == path->mnt) ||
+ unlikely(IS_ROOT(walker_path.dentry)))
break;
- parent_dentry = dget_parent(dir);
- dput(dir);
- dir = parent_dentry;
+ if (WARN_ON_ONCE(landlock_walk_path_up(&walker_path) !=
+ LANDLOCK_WALK_CONTINUE))
+ break;
}
- dput(dir);
+ path_put(&walker_path);
return ret;
}
@@ -1160,7 +1165,7 @@ static int current_check_refer_path(struct dentry *const old_dentry,
bool allow_parent1, allow_parent2;
access_mask_t access_request_parent1, access_request_parent2;
struct path mnt_dir;
- struct dentry *old_parent;
+ struct path old_parent_path;
struct layer_masks layer_masks_parent1 = {}, layer_masks_parent2 = {};
struct landlock_request request1 = {}, request2 = {};
@@ -1214,18 +1219,19 @@ static int current_check_refer_path(struct dentry *const old_dentry,
/*
* old_dentry may be the root of the common mount point and
* !IS_ROOT(old_dentry) at the same time (e.g. with open_tree() and
- * OPEN_TREE_CLONE). We do not need to call dget(old_parent) because
- * we keep a reference to old_dentry.
+ * OPEN_TREE_CLONE). We do not need to call path_get(&old_parent_path)
+ * because we keep a reference to old_dentry.
*/
- old_parent = (old_dentry == mnt_dir.dentry) ? old_dentry :
- old_dentry->d_parent;
+ old_parent_path.mnt = mnt_dir.mnt;
+ old_parent_path.dentry = (old_dentry == mnt_dir.dentry) ?
+ old_dentry :
+ old_dentry->d_parent;
/* new_dir->dentry is equal to new_dentry->d_parent */
- allow_parent1 = collect_domain_accesses(subject->domain, mnt_dir.dentry,
- old_parent,
+ allow_parent1 = collect_domain_accesses(subject->domain,
+ &old_parent_path,
&layer_masks_parent1);
- allow_parent2 = collect_domain_accesses(subject->domain, mnt_dir.dentry,
- new_dir->dentry,
+ allow_parent2 = collect_domain_accesses(subject->domain, new_dir,
&layer_masks_parent2);
if (allow_parent1 && allow_parent2)
return 0;
@@ -1244,7 +1250,7 @@ static int current_check_refer_path(struct dentry *const old_dentry,
return 0;
if (request1.access) {
- request1.audit.u.path.dentry = old_parent;
+ request1.audit.u.path.dentry = old_parent_path.dentry;
landlock_log_denial(subject, &request1);
}
if (request2.access) {
--
2.53.0
^ permalink raw reply related
* [PATCH v8 02/10] landlock: Use landlock_walk_path_up() in is_access_to_paths_allowed()
From: Justin Suess @ 2026-05-29 1:52 UTC (permalink / raw)
To: gnoack3000, mic; +Cc: linux-kernel, linux-security-module, Justin Suess
In-Reply-To: <20260529015210.500291-1-utilityemal77@gmail.com>
Replace the open-coded path-walk loop with the new
landlock_walk_path_up() helper. This removes the backward goto and
keeps the traversal logic in a single place.
No functional change intended.
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---
Notes:
v7..v8 changes:
* Reworded commit message.
* Reordered switch arms so the LANDLOCK_WALK_CONTINUE fast path comes
first, and moved the per-case explanatory comments inside the case
bodies. No functional change.
security/landlock/fs.c | 55 ++++++++++++++----------------------------
1 file changed, 18 insertions(+), 37 deletions(-)
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 8e75583c3ca7..8fb0aa59e180 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -921,46 +921,27 @@ is_access_to_paths_allowed(const struct landlock_ruleset *const domain,
if (allowed_parent1 && allowed_parent2)
break;
-jump_up:
- if (walker_path.dentry == walker_path.mnt->mnt_root) {
- if (follow_up(&walker_path)) {
- /* Ignores hidden mount points. */
- goto jump_up;
- } else {
- /*
- * Stops at the real root. Denies access
- * because not all layers have granted access.
- */
- break;
- }
- }
-
- if (unlikely(IS_ROOT(walker_path.dentry))) {
- if (likely(walker_path.mnt->mnt_flags & MNT_INTERNAL)) {
- /*
- * Stops and allows access when reaching disconnected root
- * directories that are part of internal filesystems (e.g. nsfs,
- * which is reachable through /proc/<pid>/ns/<namespace>).
- */
- allowed_parent1 = true;
- allowed_parent2 = true;
- break;
- }
-
+ switch (landlock_walk_path_up(&walker_path)) {
+ case LANDLOCK_WALK_CONTINUE:
+ continue;
+ case LANDLOCK_WALK_INTERNAL:
/*
- * We reached a disconnected root directory from a bind mount.
- * Let's continue the walk with the mount point we missed.
+ * Stops and allows access when reaching disconnected
+ * root directories that are part of internal
+ * filesystems (e.g. nsfs, which is reachable through
+ * /proc/<pid>/ns/<namespace>).
*/
- dput(walker_path.dentry);
- walker_path.dentry = walker_path.mnt->mnt_root;
- dget(walker_path.dentry);
- } else {
- struct dentry *const parent_dentry =
- dget_parent(walker_path.dentry);
-
- dput(walker_path.dentry);
- walker_path.dentry = parent_dentry;
+ allowed_parent1 = true;
+ allowed_parent2 = true;
+ break;
+ case LANDLOCK_WALK_STOP_REAL_ROOT:
+ /*
+ * Stops at the real root. Denies access because not
+ * all layers have granted access.
+ */
+ break;
}
+ break;
}
path_put(&walker_path);
--
2.53.0
^ permalink raw reply related
* [PATCH v8 01/10] landlock: Add landlock_walk_path_up() helper
From: Justin Suess @ 2026-05-29 1:52 UTC (permalink / raw)
To: gnoack3000, mic
Cc: linux-kernel, linux-security-module, Justin Suess, Tingmao Wang
In-Reply-To: <20260529015210.500291-1-utilityemal77@gmail.com>
In preparation for centralizing path-walk logic, add
landlock_walk_path_up(), which moves @path one step toward the VFS
root. Its return value indicates whether the new position is an
internal mount point, the real root, or neither (i.e. the caller
should continue walking).
No functional change intended.
Cc: Tingmao Wang <m@maowtm.org>
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---
Notes:
v7..v8 changes:
* Reworded commit message; no code changes.
security/landlock/fs.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 3b71f569a8f9..8e75583c3ca7 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -320,6 +320,38 @@ static struct landlock_object *get_inode_object(struct inode *const inode)
LANDLOCK_ACCESS_FS_RESOLVE_UNIX)
/* clang-format on */
+/**
+ * enum landlock_walk_result - Result codes for landlock_walk_path_up()
+ * @LANDLOCK_WALK_CONTINUE: Path is now neither the real root nor an internal mount point.
+ * @LANDLOCK_WALK_STOP_REAL_ROOT: Path has reached the real VFS root.
+ * @LANDLOCK_WALK_INTERNAL: Path has reached an internal mount point.
+ */
+enum landlock_walk_result {
+ LANDLOCK_WALK_CONTINUE,
+ LANDLOCK_WALK_STOP_REAL_ROOT,
+ LANDLOCK_WALK_INTERNAL,
+};
+
+static enum landlock_walk_result landlock_walk_path_up(struct path *const path)
+{
+ struct dentry *old;
+
+ while (path->dentry == path->mnt->mnt_root) {
+ if (!follow_up(path))
+ return LANDLOCK_WALK_STOP_REAL_ROOT;
+ }
+ old = path->dentry;
+ if (unlikely(IS_ROOT(old))) {
+ if (likely(path->mnt->mnt_flags & MNT_INTERNAL))
+ return LANDLOCK_WALK_INTERNAL;
+ path->dentry = dget(path->mnt->mnt_root);
+ } else {
+ path->dentry = dget_parent(old);
+ }
+ dput(old);
+ return LANDLOCK_WALK_CONTINUE;
+}
+
/*
* @path: Should have been checked by get_path_from_fd().
*/
--
2.53.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox