Linux Security Modules development


* [PATCH] selftests/landlock: explicitly disable audit
From: Maximilian Heyne @ 2026-05-29 20:03 UTC
  To: stable
  Cc: Maximilian Heyne, Mickaël Salaün, Günther Noack,
	Shuah Khan, linux-security-module, linux-kselftest, linux-kernel

I'm seeing sporadic selftest failures, such as

  #  RUN           scoped_audit.connect_to_child ...
  # scoped_abstract_unix_test.c:314:connect_to_child:Expected 0 (0) == records.access (8)
  # connect_to_child: Test failed
  #          FAIL  scoped_audit.connect_to_child
  not ok 19 scoped_audit.connect_to_child

This seems similar to what commit 3647a4977fb73d ("selftests/landlock:
Drain stale audit records on init") tried to fix. However, the added
drain loop is not effective. When AUDIT_STATUS_PID is set, the
kauditd_thread is woken up and starts sending messages from the hold
queue over netlink. Depending on how this kthread is scheduled, not all
messages might be sent over netlink within the 1 us interval.

Therefore, instead of trying to drain the queue, let's just disable
audit when running non-audit tests or, more precisely, disable it after
the audit tests. This way we won't generate any new audit messages that
could interfere with the other tests.

The comment saying that on process exit audit will be disabled is wrong.
The closed file descriptor just causes an auditd_reset(), not a
disablement. So future messages will be queued in the hold queue.

Cc: stable@vger.kernel.org
Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
---

I've seen the failures on the 6.18 kernels but haven't tested on latest
upstream. However, I still think this is an issue.
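
For readers unfamiliar with the audit netlink interface, here is a rough
sketch of what explicitly disabling audit amounts to: an AUDIT_SET
request whose status mask selects AUDIT_STATUS_ENABLED and whose enabled
field is 0. The struct audit_set_msg and the helper
build_audit_disable() are hypothetical names made up for this
illustration, not the selftest's own helpers; only the uapi definitions
from <linux/audit.h> and <linux/netlink.h> are real. The sketch only
builds the message and does not send it.

```c
#include <assert.h>
#include <string.h>
#include <linux/audit.h>
#include <linux/netlink.h>

/* Hypothetical layout: netlink header followed by the status payload. */
struct audit_set_msg {
	struct nlmsghdr header;
	struct audit_status status;
};

/*
 * Fills @msg with an AUDIT_SET request asking the kernel to set the
 * "enabled" state to 0.  Nothing is sent here; a real caller would
 * write @msg to a NETLINK_AUDIT socket and wait for the ACK.
 */
static void build_audit_disable(struct audit_set_msg *msg)
{
	assert(msg);
	memset(msg, 0, sizeof(*msg));
	msg->header.nlmsg_len = NLMSG_LENGTH(sizeof(msg->status));
	msg->header.nlmsg_type = AUDIT_SET;
	msg->header.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
	/* The mask restricts the update to the "enabled" field. */
	msg->status.mask = AUDIT_STATUS_ENABLED;
	msg->status.enabled = 0;
}
```

Because the mask only names AUDIT_STATUS_ENABLED, other status fields
(such as the registered auditd PID) are left alone by such a request.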

---
 tools/testing/selftests/landlock/audit.h | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
index 834005b2b0f09..7842330875f53 100644
--- a/tools/testing/selftests/landlock/audit.h
+++ b/tools/testing/selftests/landlock/audit.h
@@ -494,10 +494,9 @@ static int audit_init_filter_exe(struct audit_filter *filter, const char *path)
 static int audit_cleanup(int audit_fd, struct audit_filter *filter)
 {
 	struct audit_filter new_filter;
+	int err;

 	if (audit_fd < 0 || !filter) {
-		int err;
-
 		/*
 		 * Simulates audit_init_with_exe_filter() when called from
 		 * FIXTURE_TEARDOWN_PARENT().
@@ -518,12 +517,10 @@ static int audit_cleanup(int audit_fd, struct audit_filter *filter)
 	audit_filter_exe(audit_fd, filter, AUDIT_DEL_RULE);
 	audit_filter_drop(audit_fd, AUDIT_DEL_RULE);

-	/*
-	 * Because audit_cleanup() might not be called by the test auditd
-	 * process, it might not be possible to explicitly set it.  Anyway,
-	 * AUDIT_STATUS_ENABLED will implicitly be set to 0 when the auditd
-	 * process will exit.
-	 */
+	err = audit_set_status(audit_fd, AUDIT_STATUS_ENABLED, 0);
+	if (err)
+		return err;
+
 	return close(audit_fd);
 }

-- 
2.50.1



* [syzbot] [lsm?] KASAN: slab-use-after-free Read in security_inode_follow_link
From: syzbot @ 2026-05-29 20:01 UTC
  To: jmorris, linux-kernel, linux-security-module, paul, serge,
	syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    eb3f4b7426cf Merge tag 'nfsd-7.1-2' of git://git.kernel.or..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17dae52e580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=8118209836970b54
dashboard link: https://syzkaller.appspot.com/bug?extid=0962e3a1af6d5e26a52c
compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=14427ed2580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=109e452e580000

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-eb3f4b74.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/1406171f5cf6/vmlinux-eb3f4b74.xz
kernel image: https://storage.googleapis.com/syzbot-assets/19b8bb9b727a/bzImage-eb3f4b74.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+0962e3a1af6d5e26a52c@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: slab-use-after-free in security_inode_follow_link+0x277/0x280 security/security.c:1819
Read of size 4 at addr ffff8880393e0004 by task syz.0.594/8610

CPU: 3 UID: 0 PID: 8610 Comm: syz.0.594 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:378 [inline]
 print_report+0x13d/0x4b0 mm/kasan/report.c:482
 kasan_report+0xdf/0x1d0 mm/kasan/report.c:595
 security_inode_follow_link+0x277/0x280 security/security.c:1819
 pick_link+0x433/0x13c0 fs/namei.c:2049
 step_into_slowpath+0x9ba/0xf90 fs/namei.c:2123
 step_into fs/namei.c:2148 [inline]
 walk_component fs/namei.c:2284 [inline]
 link_path_walk+0xf28/0x1cc0 fs/namei.c:2652
 path_parentat fs/namei.c:2856 [inline]
 __filename_parentat+0x213/0x740 fs/namei.c:2880
 filename_parentat fs/namei.c:2898 [inline]
 filename_unlinkat+0xf7/0x730 fs/namei.c:5537
 __do_sys_unlinkat fs/namei.c:5597 [inline]
 __se_sys_unlinkat fs/namei.c:5589 [inline]
 __x64_sys_unlinkat+0xc0/0x130 fs/namei.c:5589
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x115/0x870 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f60b0f9ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f60b1dc9028 EFLAGS: 00000246 ORIG_RAX: 0000000000000107
RAX: ffffffffffffffda RBX: 00007f60b1215fa0 RCX: 00007f60b0f9ce59
RDX: 0000000000000000 RSI: 00002000000001c0 RDI: 0000000000000006
RBP: 00007f60b1032d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f60b1216038 R14: 00007f60b1215fa0 R15: 00007ffddfc81a28
 </TASK>

Allocated by task 8610:
 kasan_save_stack+0x30/0x50 mm/kasan/common.c:57
 kasan_save_track+0x14/0x30 mm/kasan/common.c:78
 unpoison_slab_object mm/kasan/common.c:340 [inline]
 __kasan_slab_alloc+0x89/0x90 mm/kasan/common.c:366
 kasan_slab_alloc include/linux/kasan.h:253 [inline]
 slab_post_alloc_hook mm/slub.c:4570 [inline]
 slab_alloc_node mm/slub.c:4899 [inline]
 kmem_cache_alloc_lru_noprof+0x246/0x6e0 mm/slub.c:4918
 alloc_inode+0x183/0x250 fs/inode.c:347
 new_inode+0x22/0x1c0 fs/inode.c:1179
 bpf_get_inode kernel/bpf/inode.c:117 [inline]
 bpf_get_inode kernel/bpf/inode.c:102 [inline]
 bpf_symlink+0x69/0x240 kernel/bpf/inode.c:391
 vfs_symlink fs/namei.c:5643 [inline]
 vfs_symlink+0x178/0x4d0 fs/namei.c:5622
 filename_symlinkat+0x2a6/0x560 fs/namei.c:5668
 __do_sys_symlinkat fs/namei.c:5688 [inline]
 __se_sys_symlinkat fs/namei.c:5683 [inline]
 __x64_sys_symlinkat+0x9c/0xe0 fs/namei.c:5683
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x115/0x870 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 8611:
 kasan_save_stack+0x30/0x50 mm/kasan/common.c:57
 kasan_save_track+0x14/0x30 mm/kasan/common.c:78
 kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:584
 poison_slab_object mm/kasan/common.c:253 [inline]
 __kasan_slab_free+0x5f/0x80 mm/kasan/common.c:285
 kasan_slab_free include/linux/kasan.h:235 [inline]
 slab_free_hook mm/slub.c:2689 [inline]
 slab_free mm/slub.c:6251 [inline]
 kmem_cache_free+0x127/0x6c0 mm/slub.c:6378
 destroy_inode+0xcb/0x1c0 fs/inode.c:394
 evict+0x599/0xad0 fs/inode.c:865
 iput_final fs/inode.c:1960 [inline]
 iput.part.0+0x605/0xf50 fs/inode.c:2009
 iput+0x35/0x40 fs/inode.c:1975
 filename_unlinkat+0x466/0x730 fs/namei.c:5572
 __do_sys_unlinkat fs/namei.c:5597 [inline]
 __se_sys_unlinkat fs/namei.c:5589 [inline]
 __x64_sys_unlinkat+0xc0/0x130 fs/namei.c:5589
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x115/0x870 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at ffff8880393e0000
 which belongs to the cache inode_cache of size 1088
The buggy address is located 4 bytes inside of
 freed 1088-byte region [ffff8880393e0000, ffff8880393e0440)

The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x393e0
head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
memcg:ffff88803ce9f201
flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff)
page_type: f5(slab)
raw: 00fff00000000040 ffff888100090000 dead000000000100 dead000000000122
raw: 0000000000000000 00000008001a001a 00000000f5000000 ffff88803ce9f201
head: 00fff00000000040 ffff888100090000 dead000000000100 dead000000000122
head: 0000000000000000 00000008001a001a 00000000f5000000 ffff88803ce9f201
head: 00fff00000000003 fffffffffffffe01 00000000ffffffff 00000000ffffffff
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Reclaimable, gfp_mask 0xd20d0(__GFP_RECLAIMABLE|__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 5753, tgid 5753 (syz-execprog), ts 103858448509, free_ts 0
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0xfd/0x120 mm/page_alloc.c:1853
 prep_new_page mm/page_alloc.c:1861 [inline]
 get_page_from_freelist+0x11a6/0x3410 mm/page_alloc.c:3941
 __alloc_frozen_pages_noprof+0x27c/0x2bc0 mm/page_alloc.c:5221
 alloc_slab_page mm/slub.c:3278 [inline]
 allocate_slab mm/slub.c:3467 [inline]
 new_slab+0xa6/0x6c0 mm/slub.c:3525
 refill_objects+0x277/0x420 mm/slub.c:7272
 refill_sheaf mm/slub.c:2816 [inline]
 __pcs_replace_empty_main+0x375/0x650 mm/slub.c:4652
 alloc_from_pcs mm/slub.c:4750 [inline]
 slab_alloc_node mm/slub.c:4884 [inline]
 kmem_cache_alloc_lru_noprof+0x485/0x6e0 mm/slub.c:4918
 alloc_inode+0x183/0x250 fs/inode.c:347
 iget_locked+0x1d9/0x6d0 fs/inode.c:1474
 kernfs_get_inode+0x46/0x470 fs/kernfs/inode.c:252
 kernfs_iop_lookup+0x1a7/0x2d0 fs/kernfs/dir.c:1274
 lookup_open.isra.0+0x631/0x11b0 fs/namei.c:4484
 open_last_lookups fs/namei.c:4611 [inline]
 path_openat+0xa98/0x31a0 fs/namei.c:4855
 do_file_open+0x20e/0x430 fs/namei.c:4887
 do_sys_openat2+0x10d/0x1e0 fs/open.c:1364
 do_sys_open fs/open.c:1370 [inline]
 __do_sys_openat fs/open.c:1386 [inline]
 __se_sys_openat fs/open.c:1381 [inline]
 __x64_sys_openat+0x12d/0x210 fs/open.c:1381
page_owner free stack trace missing

Memory state around the buggy address:
 ffff8880393dff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff8880393dff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8880393e0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff8880393e0080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880393e0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* [PATCH v3 2/2] selftests/landlock: test SCOPE_SIGNAL on the SIGIO/fowner pgid path
From: hexlabsecurity @ 2026-05-29 19:08 UTC
  To: Mickaël Salaün
  Cc: Justin Suess, gnoack@google.com,
	linux-security-module@vger.kernel.org, stable@vger.kernel.org

From 06174d6988915949342c86fe4d1ee210571a2321 Mon Sep 17 00:00:00 2001
From: Bryam Vargas <hexlabsecurity@proton.me>
Date: Fri, 29 May 2026 12:51:27 -0500
Subject: [PATCH v3 2/2] selftests/landlock: test SCOPE_SIGNAL on the
 SIGIO/fowner pgid path

Add a regression test for the LANDLOCK_SCOPE_SIGNAL bypass on the
asynchronous SIGIO delivery path.  A sandboxed task that owns a file via
fcntl(F_SETOWN, -pgrp) while sitting at the head of its process group's
PID hlist (the default position after fork()) used to have its Landlock
subject capture skipped, letting the SIGIO fan-out reach non-sandboxed
members of the process group.

The test creates a dedicated process group, sandboxes the (hlist-head)
child with LANDLOCK_SCOPE_SIGNAL, arms F_SETSIG(SIGURG) / F_SETOWN(-pgrp)
/ O_ASYNC on a pipe and triggers the fan-out.  The in-domain child must
receive the signal (proving the trigger fired); the non-sandboxed parent,
which is outside the child's domain, must not.  Without the fix the parent
is signaled and the test fails.

Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
 .../selftests/landlock/scoped_signal_test.c   | 97 +++++++++++++++++++
 1 file changed, 97 insertions(+)

diff --git a/tools/testing/selftests/landlock/scoped_signal_test.c b/tools/testing/selftests/landlock/scoped_signal_test.c
index d8bf33417619..05151929c263 100644
--- a/tools/testing/selftests/landlock/scoped_signal_test.c
+++ b/tools/testing/selftests/landlock/scoped_signal_test.c
@@ -559,4 +559,101 @@ TEST_F(fown, sigurg_socket)
 		_metadata->exit_code = KSFT_FAIL;
 }
 
+/*
+ * Checks that LANDLOCK_SCOPE_SIGNAL is enforced on the asynchronous SIGIO
+ * delivery path (fcntl(F_SETOWN)) when the file owner is a process group.
+ *
+ * A sandboxed task sitting at the head of its process group's PID hlist (the
+ * default position right after fork()) used to escape the
+ * fcntl(F_SETOWN, -pgrp) subject capture: pid_task(pgrp, PIDTYPE_PGID)
+ * resolved to the task itself, so the same-thread-group exemption skipped
+ * recording its Landlock domain.  At SIGIO time the cached subject was then
+ * empty and the signal fanned out to every group member, including
+ * non-sandboxed tasks outside the domain.
+ */
+TEST(sigio_to_pgid_members)
+{
+	int trigger[2], sync_child[2];
+	char buf;
+	pid_t child;
+	int status, i;
+
+	drop_caps(_metadata);
+
+	/*
+	 * Isolates the test in its own process group so the SIGIO fan-out
+	 * stays bounded to this parent and the child forked below.
+	 */
+	ASSERT_EQ(0, setpgid(0, 0));
+
+	/* The non-sandboxed parent is the protected (out-of-domain) target. */
+	ASSERT_EQ(0, setup_signal_handler(SIGURG));
+	signal_received = 0;
+
+	ASSERT_EQ(0, pipe2(trigger, O_CLOEXEC));
+	ASSERT_EQ(0, pipe2(sync_child, O_CLOEXEC));
+
+	child = fork();
+	ASSERT_LE(0, child);
+	if (child == 0) {
+		/*
+		 * The child inherits the parent's new process group and, just
+		 * attached with hlist_add_head_rcu(), is now the head of the
+		 * pgid hlist: this is the case that used to skip the capture.
+		 */
+		EXPECT_EQ(0, close(sync_child[0]));
+
+		/* In-domain positive control: the child must be signaled. */
+		ASSERT_EQ(0, setup_signal_handler(SIGURG));
+		signal_received = 0;
+
+		create_scoped_domain(_metadata, LANDLOCK_SCOPE_SIGNAL);
+
+		/* Owns the SIGIO source for the whole process group. */
+		ASSERT_EQ(0, fcntl(trigger[0], F_SETSIG, SIGURG));
+		ASSERT_EQ(0, fcntl(trigger[0], F_SETOWN, -getpgrp()));
+		ASSERT_EQ(0, fcntl(trigger[0], F_SETFL, O_ASYNC));
+
+		/* Fans SIGURG out to every member of the process group. */
+		ASSERT_EQ(1, write(trigger[1], ".", 1));
+
+		/*
+		 * The sandboxed child is in its own domain and must always be
+		 * signaled: this proves the SIGIO actually fired.
+		 */
+		for (i = 0; i < 1000 && !signal_received; i++)
+			usleep(1000);
+		EXPECT_EQ(1, signal_received);
+
+		ASSERT_EQ(1, write(sync_child[1], ".", 1));
+		EXPECT_EQ(0, close(sync_child[1]));
+
+		_exit(_metadata->exit_code);
+		return;
+	}
+	EXPECT_EQ(0, close(sync_child[1]));
+	EXPECT_EQ(0, close(trigger[0]));
+	EXPECT_EQ(0, close(trigger[1]));
+
+	/* Waits for the child to generate the SIGIO. */
+	ASSERT_EQ(1, read(sync_child[0], &buf, 1));
+	EXPECT_EQ(0, close(sync_child[0]));
+
+	/* Lets a delivered-but-pending signal run our handler, if any. */
+	for (i = 0; i < 100 && !signal_received; i++)
+		usleep(1000);
+
+	/*
+	 * SCOPE_SIGNAL must block the fan-out to this non-sandboxed parent,
+	 * which is outside the child's Landlock domain.  Before the fix the
+	 * parent was signaled here.
+	 */
+	EXPECT_EQ(0, signal_received);
+
+	ASSERT_EQ(child, waitpid(child, &status, 0));
+	if (WIFSIGNALED(status) || !WIFEXITED(status) ||
+	    WEXITSTATUS(status) != EXIT_SUCCESS)
+		_metadata->exit_code = KSFT_FAIL;
+}
+
 TEST_HARNESS_MAIN
-- 
2.43.0



* [PATCH v3 1/2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via F_SETOWN to invoker's pgid
From: hexlabsecurity @ 2026-05-29 19:07 UTC
  To: Mickaël Salaün
  Cc: Justin Suess, gnoack@google.com,
	linux-security-module@vger.kernel.org, stable@vger.kernel.org

From b5fdc79ce1cb2881d59dfed01d3d9170306be9e8 Mon Sep 17 00:00:00 2001
From: Bryam Vargas <hexlabsecurity@proton.me>
Date: Fri, 29 May 2026 12:49:41 -0500
Subject: [PATCH v3 1/2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via
 F_SETOWN to invoker's pgid

A Landlock-restricted process can bypass LANDLOCK_SCOPE_SIGNAL on the
SIGIO delivery path and deliver arbitrary signals (including SIGKILL via
F_SETSIG) to non-Landlocked targets that share its pgid, by exploiting a
producer-side cache-vs-live evaluation gap.

The SIGIO path in hook_file_send_sigiotask() consults a cached subject
stored in landlock_file(file)->fown_subject at fcntl(F_SETOWN) time
(via hook_file_set_fowner()), instead of evaluating the live Landlock
domain of the invoking task at signal-send time. The capture is gated
by control_current_fowner(), which returns false (skipping capture)
when pid_task(fown->pid, fown->pid_type) is in current's thread group.

This is correct for PIDTYPE_TGID / PIDTYPE_PID, where the target is a
single task sharing current's cred. It is unsafe for PIDTYPE_PGID and
PIDTYPE_SID: when current is at the head of its pgid hlist -- the
default placement after fork(), hlist_add_head_rcu() in kernel/fork.c --
pid_task(pgid, PIDTYPE_PGID) resolves to current itself,
same_thread_group(current, current) is true, the capture is skipped, and
fown_subject.domain stays NULL. hook_file_send_sigiotask() then
short-circuits at "if (!subject->domain) return 0;", letting the kernel
fan the signal out to every member of the group, including tasks outside
current's Landlock domain that SCOPE_SIGNAL is supposed to protect.

The direct kill() path (hook_task_kill) is unaffected: it evaluates
current's live domain on every call. Only the cached SIGIO path is
broken.

Tighten control_current_fowner() to apply the thread-group exemption
only when the target identifies a single task whose Landlock cred is
necessarily shared with current (PIDTYPE_TGID, PIDTYPE_PID). For
PIDTYPE_PGID and PIDTYPE_SID, always capture the current Landlock
subject so the consumer's scope check runs against every member of the
group at delivery time.

Stable kernels before the fown_subject conversion store the domain in
landlock_file(file)->fown_domain; control_current_fowner() is identical
there, so the same exemption and the same fix apply.

Fixes: 18eb75f3af40 ("landlock: Always allow signals between threads of the same process")
Cc: stable@vger.kernel.org
Reported-by: Bryam Vargas <hexlabsecurity@proton.me>
Tested-by: Justin Suess <utilityemal77@gmail.com>
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
 security/landlock/fs.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index c1ecfe239032..edaa52572cbd 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -1909,6 +1909,18 @@ static bool control_current_fowner(struct fown_struct *const fown)
 	if (!p)
 		return true;

+	/*
+	 * For PIDTYPE_PGID and PIDTYPE_SID, signal delivery fans out to
+	 * every member of the group at SIGIO time. Even when pid_task()
+	 * resolves to current itself (e.g., current is the pgid hlist
+	 * head post-fork), non-current members of the group are still
+	 * valid targets that must be checked by hook_file_send_sigiotask().
+	 * Always capture the current subject for those types so the
+	 * consumer scope check runs against the live fown_subject.
+	 */
+	if (fown->pid_type == PIDTYPE_PGID || fown->pid_type == PIDTYPE_SID)
+		return true;
+
 	return !same_thread_group(p, current);
 }

-- 
2.43.0


* Re: [REPORT] landlock: SCOPE_SIGNAL bypass via F_SETOWN to invoker pgid -> SIGIO/SIGKILL to non-sandboxed targets
From: hexlabsecurity @ 2026-05-29 19:03 UTC
  To: Mickaël Salaün
  Cc: Justin Suess, gnoack@google.com,
	linux-security-module@vger.kernel.org, stable@vger.kernel.org
In-Reply-To: <20260529.li6kaiDaim4B@digikod.net>

Hi Mickaël,

> Could you please replace the reproducer code with a proper kselftest?
> That would need to be a new email patch (v3) [...]

Done -- v3 is a two-patch series:

  [PATCH v3 1/2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via F_SETOWN to invoker's pgid
  [PATCH v3 2/2] selftests/landlock: test SCOPE_SIGNAL on the SIGIO/fowner pgid path

Patch 2 replaces the informal reproducer with a regression test in
scoped_signal_test.c, reusing the existing fown/SIGURG idiom. It adds
TEST(sigio_to_pgid_members): a sandboxed child at the head of its pgid hlist
arms F_SETSIG(SIGURG) / F_SETOWN(-pgrp) / O_ASYNC and triggers the fan-out; the
in-domain child must be signaled (positive control) and the non-sandboxed
parent must not.

I also added the Fixes: tag and Cc: stable that v2 was missing:

  Fixes: 18eb75f3af40 ("landlock: Always allow signals between threads of the same process")

That is where the same-thread-group exemption on the fowner path was
introduced (v6.15; backported to 6.12.y/6.13.y/6.14.y -- the original v6.12
signal scoping captured the subject unconditionally and was not affected).
The fix hunk itself is unchanged from v1/v2 and keeps Justin's Tested-by.

A/B on 6.12.90 + CONFIG_SECURITY_LANDLOCK (same .config, only the hunk
differs): without patch 1 the new test fails (the parent is signaled); with it
the test passes and the landlock signal-scoping suite is 20/20. checkpatch is
clean except one expected Reported-by/Closes warning -- the original report was
sent to security@kernel.org, so there is no public URL to point Closes: at.

Thanks,
Bryam Vargas

Independent security researcher. HEXLAB SAS (registration pending) -- Cali, Colombia.

This series fixes a LANDLOCK_SCOPE_SIGNAL bypass on the asynchronous SIGIO
(fcntl(F_SETOWN)) delivery path and adds the kselftest requested in review.

Patch 1 narrows the same-thread-group exemption in control_current_fowner()
so that F_SETOWN to a process group (or session) always captures the caller's
Landlock subject. Without it, a sandboxed task at the head of its pgid hlist
(the default position after fork()) skips the capture, and the SIGIO fan-out
reaches non-sandboxed members of the process group, defeating SCOPE_SIGNAL.
The direct kill() path (hook_task_kill) is unaffected.
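
To make the narrowed exemption concrete, here is a small userspace model
of the capture decision before and after patch 1. It is only a sketch:
the enum and the two model_capture_*() functions are hypothetical
stand-ins that mirror the tail of control_current_fowner() as shown in
the diff, with the pid_task() lookup and same_thread_group() result
passed in as booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel's pid types relevant here. */
enum model_pid_type { MODEL_PID, MODEL_TGID, MODEL_PGID, MODEL_SID };

/*
 * Both functions return true when the caller's Landlock subject should
 * be recorded at F_SETOWN time.
 * @target_found: pid_task() resolved to a task.
 * @same_thread_group: that task is in the caller's thread group.
 */
static bool model_capture_old(bool target_found, bool same_thread_group)
{
	if (!target_found)
		return true;
	/* Exemption applied to every pid type, including groups. */
	return !same_thread_group;
}

static bool model_capture_fixed(enum model_pid_type type, bool target_found,
				bool same_thread_group)
{
	if (!target_found)
		return true;
	/* Group targets fan out to other tasks: always capture. */
	if (type == MODEL_PGID || type == MODEL_SID)
		return true;
	return !same_thread_group;
}
```

In the problematic case (pgid target resolving to the caller itself),
model_capture_old(true, true) returns false, so no subject is recorded,
whereas model_capture_fixed(MODEL_PGID, true, true) returns true. For
MODEL_TGID and MODEL_PID the two functions agree.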

Patch 2 adds a regression test to scoped_signal_test.c, replacing the informal
reproducer that previously accompanied the fix.

The defect was introduced by commit 18eb75f3af40 ("landlock: Always allow
signals between threads of the same process") in v6.15, and is present in the
stable branches that backported it (6.12.y, 6.13.y, 6.14.y).
control_current_fowner() is identical across those branches, so patch 1 applies
as-is (stable kernels before the fown_subject conversion store the domain in
landlock_file(file)->fown_domain; the exemption and the fix are the same).

A/B verified on 6.12.90 + CONFIG_SECURITY_LANDLOCK (same .config, only the fix
hunk differs):
  - without patch 1: the new test fails -- the non-sandboxed parent receives
    the signal (SCOPE_SIGNAL bypassed);
  - with patch 1: the new test passes, and the whole landlock signal-scoping
    suite passes 20/20 (no regression).

v2 -> v3:
  - patch 1: add Fixes: tag and Cc: stable; the fix hunk is unchanged from v1/v2.
  - patch 2 (new): replace the git-notes reproducer with a kselftest.
  - v1/v2 were sent to security@kernel.org (embargoed; not in a public archive).

Bryam Vargas (2):
  landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via F_SETOWN to invoker's pgid
  selftests/landlock: test SCOPE_SIGNAL on the SIGIO/fowner pgid path

 security/landlock/fs.c                        | 12 +++
 .../selftests/landlock/scoped_signal_test.c   | 97 +++++++++++++++++++
 2 files changed, 109 insertions(+)

base-commit: 27fa82620cbaa89a7fc11ac3057701d598813e87


* Re: [PATCH 05/11] hornet: gen_sig: fix off-by-one check for used maps
From: Paul Moore @ 2026-05-29 18:54 UTC
  To: Blaise Boscaccy
  Cc: Jonathan Corbet, Shuah Khan, James Morris, Serge E. Hallyn,
	Eric Biggers, Fan Wu, James.Bottomley, linux-security-module
In-Reply-To: <87tsrqxfdc.fsf@microsoft.com>

On Fri, May 29, 2026 at 2:03 PM Blaise Boscaccy
<bboscaccy@linux.microsoft.com> wrote:
> Paul Moore <paul@paul-moore.com> writes:
>
> > On Wed, May 27, 2026 at 11:09 PM Blaise Boscaccy
> > <bboscaccy@linux.microsoft.com> wrote:
> >>
> >> A logic bug limited the maximum number of used maps to
> >> MAX_USED_MAPS-1.
> >
> > Should this be MAX_HASHES-1 and not MAX_USED_MAPS-1?
>
> Good eye. Yes that should be MAX_HASHES-1 in the commit message.

Okay, I'll fix that up when merging.

-- 
paul-moore.com


* Re: [PATCH 05/11] hornet: gen_sig: fix off-by-one check for used maps
From: Blaise Boscaccy @ 2026-05-29 18:03 UTC
  To: Paul Moore
  Cc: Jonathan Corbet, Shuah Khan, James Morris, Serge E. Hallyn,
	Eric Biggers, Fan Wu, James.Bottomley, linux-security-module
In-Reply-To: <CAHC9VhR6G5qmd3kXPas_L_SiJx=6J=wUw80xxL9Eu4=tSjMAoQ@mail.gmail.com>

Paul Moore <paul@paul-moore.com> writes:

> On Wed, May 27, 2026 at 11:09 PM Blaise Boscaccy
> <bboscaccy@linux.microsoft.com> wrote:
>>
>> A logic bug limited the maximum number of used maps to
>> MAX_USED_MAPS-1.
>
> Should this be MAX_HASHES-1 and not MAX_USED_MAPS-1?
>

Good eye. Yes that should be MAX_HASHES-1 in the commit message.

>> Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
>> ---
>>  scripts/hornet/gen_sig.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/scripts/hornet/gen_sig.c b/scripts/hornet/gen_sig.c
>> index b4f983ab24bcd..4e8caad22f381 100644
>> --- a/scripts/hornet/gen_sig.c
>> +++ b/scripts/hornet/gen_sig.c
>> @@ -317,11 +317,11 @@ int main(int argc, char **argv)
>>                         data_path = optarg;
>>                         break;
>>                 case 'A':
>> -                       hashes[hash_count].file = optarg;
>> -                       if (++hash_count >= MAX_HASHES) {
>> +                       if (hash_count >= MAX_HASHES) {
>>                                 usage(argv[0]);
>>                                 return EXIT_FAILURE;
>>                         }
>> +                       hashes[hash_count++].file = optarg;
>>                         break;
>>                 default:
>>                         usage(argv[0]);
>> --
>> 2.53.0
>
> -- 
> paul-moore.com
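
The off-by-one is easy to see in a standalone model of the two variants.
This is a sketch for illustration only: MAX_HASHES is given an arbitrary
small value here, and add_old(), add_fixed() and count_accepted() are
hypothetical helpers, not code from gen_sig.c.

```c
#include <assert.h>

#define MAX_HASHES 4 /* Illustrative value; the real constant may differ. */

/* Old logic: store, then fail once the incremented count reaches the cap. */
static int add_old(const char **slots, int *count, const char *arg)
{
	slots[*count] = arg;
	if (++(*count) >= MAX_HASHES)
		return -1;
	return 0;
}

/* Fixed logic: refuse only when the array is already full, then store. */
static int add_fixed(const char **slots, int *count, const char *arg)
{
	if (*count >= MAX_HASHES)
		return -1;
	slots[(*count)++] = arg;
	return 0;
}

/* Returns how many entries are accepted before the first failure. */
static int count_accepted(int (*add)(const char **, int *, const char *))
{
	const char *slots[MAX_HASHES];
	int count = 0, accepted = 0;

	while (accepted <= MAX_HASHES && add(slots, &count, "x") == 0)
		accepted++;
	return accepted;
}
```

With these definitions, count_accepted(add_old) yields MAX_HASHES - 1
because the entry that fills the last slot is stored but then rejected,
while count_accepted(add_fixed) yields MAX_HASHES.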


* Re: [PATCH bpf v3 2/2] bpf, libbpf: reject non-exclusive metadata maps in the signed loader
From: Alexei Starovoitov @ 2026-05-29 15:01 UTC
  To: Daniel Borkmann
  Cc: KP Singh, bpf, LSM List, Alexei Starovoitov,
	Kumar Kartikeya Dwivedi
In-Reply-To: <544dbc0d-24d2-423f-9db4-07976d67a9d0@iogearbox.net>

On Fri, May 29, 2026 at 5:25 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 5/23/26 5:12 PM, Alexei Starovoitov wrote:
> > On Fri, May 22, 2026 at 11:53 PM KP Singh <kpsingh@kernel.org> wrote:
> >>
> >> The loader verifies map->sha against the metadata hash in its
> >> instructions. map->sha is calculated when BPF_OBJ_GET_INFO_BY_FD is called
> >> on the frozen map.
> >>
> >> While the map is frozen, the loader must also ensure the map is
> >> exclusive, as, without exclusivity, another BPF program with map access
> >> can mutate the contents afterwards, so the check passes on stale data.
> >
> > Hold on. How is this an issue? excl_prog_sha guarantees
> > that only loader prog can use this map.
> > Are you saying the same loader prog will use the same map
> > for the 2nd time. Ok. I still don't see a problem.
> >
> >> Place excl_prog_sha right after sha[] in struct bpf_map and have
> >> gen_loader bail with -EINVAL when it is NULL, via BPF_PSEUDO_MAP_IDX at
> >> fixed offset 32. The 8-byte read of the pointer field limits this to
> >> 64-bit kernels; gen_loader needs target pointer size tracking to emit
> >> the right sized read on 32-bit (follow-up).
> >
> > I don't think we can go from maybe-racy to certainly-broken-on-32-bit.
> > So only applied patch 1.
>
> I've looked a bit more into it with regards to above question from Alexei
> as well as the __bpf_md_ptr issue.
>
> Imho, KP is correct that the extra check/enforcement is needed. So Alice
> as a trusted signer generates the loader program (loader_insns + data_blob)
> and signs it. The loader program contains the below enforcement to reject
> if the metadata map was not exclusive.
>
> Now the (untrusted) host that wants to load the program holds a signed
> loader and can't change a byte of it without breaking the signature.
>
> However, it could simply omit excl_prog_hash on BPF_MAP_CREATE for the data
> map (which would "normally" be bound exclusively to the loader).
>
> Then check_map_prog_compatibility() enforcement is skipped on verifier side
> given excl_prog_sha is not set. The loader loads fine, the fingerprint check
> can then pass against a stale snapshot while a different program mangled the
> data_blob underneath.
>
> Regarding __bpf_md_ptr, I would solve it differently via a fixed-size
> field, see below, together with the excl check coming before the signature
> check in the loader, the build bug assertions, and a jmp if not equal to 1.
>
>   include/linux/bpf.h        |  1 +
>   kernel/bpf/syscall.c       |  5 +++++
>   tools/lib/bpf/gen_loader.c | 17 +++++++++++++++++
>   3 files changed, 23 insertions(+)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index cd191c5fdb0a..487f4653d8a6 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -295,6 +295,7 @@ struct bpf_map_owner {
>
>   struct bpf_map {
>         u8 sha[SHA256_DIGEST_SIZE];
> +       u32 excl;
>         const struct bpf_map_ops *ops;
>         struct bpf_map *inner_map_meta;
>   #ifdef CONFIG_SECURITY
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 630d530782fe..37dacdbc5c01 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1572,6 +1572,11 @@ static int map_create(union bpf_attr *attr, bpfptr_t uattr)
>                         err = -EFAULT;
>                         goto free_map;
>                 }
> +
> +               /* See libbpf: emit_signature_match() */
> +               BUILD_BUG_ON(offsetof(struct bpf_map, excl) != SHA256_DIGEST_SIZE);
> +               BUILD_BUG_ON(offsetof(struct bpf_map, sha)  != 0);
> +               map->excl = 1;
>         } else if (attr->excl_prog_hash_size) {
>                 err = -EINVAL;
>                 goto free_map;
> diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
> index bcea21c3b7bb..cd8d7df94ac7 100644
> --- a/tools/lib/bpf/gen_loader.c
> +++ b/tools/lib/bpf/gen_loader.c
> @@ -586,6 +586,23 @@ static void emit_signature_match(struct bpf_gen *gen)
>         __s64 off;
>         int i;
>
> +       /*
> +        * Reject if the metadata map is not exclusive. Without exclusivity
> +        * the cached map->sha[] verified above can be stale: another BPF
> +        * program with map access could have mutated the contents between
> +        * BPF_OBJ_GET_INFO_BY_FD and loader execution.
> +        */
> +       emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX,
> +                                        0, 0, 0, 0));
> +       emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, SHA256_DIGEST_LENGTH));
> +       off = -(gen->insn_cur - gen->insn_start - gen->cleanup_label) / 8 - 2;
> +       if (is_simm16(off)) {
> +               emit(gen, BPF_MOV64_IMM(BPF_REG_7, -EINVAL));
> +               emit(gen, BPF_JMP_IMM(BPF_JNE, BPF_REG_2, 1, off));
> +       } else {
> +               gen->error = -ERANGE;
> +       }

yeah. much cleaner. ship it.


* Re: [PATCH v5 12/13] ima: Return error on deleting measurements already copied during kexec
From: Roberto Sassu @ 2026-05-29 14:59 UTC (permalink / raw)
  To: Mimi Zohar, corbet, skhan, dmitry.kasatkin, eric.snowberg, paul,
	jmorris, serge
  Cc: linux-doc, linux-kernel, linux-integrity, linux-security-module,
	gregorylumen, chenste, nramas, Roberto Sassu
In-Reply-To: <ea886419ef3047ede1885504fad8f865cdcc5ce3.camel@linux.ibm.com>

On Tue, 2026-05-26 at 10:02 -0400, Mimi Zohar wrote:
> On Wed, 2026-04-29 at 18:03 +0200, Roberto Sassu wrote:
> > From: Roberto Sassu <roberto.sassu@huawei.com>
> > 
> > Refuse to delete staged or active list measurements, if a kexec racing with
> > the deletion already copied those measurements in the kexec buffer. In this
> > way, user space becomes aware that those measurements are going to appear
> > in the secondary kernel, and thus they don't have to be saved twice.
> 
> There are two reboot notifiers: one to prevent additional measurements extending
> the TPM, while the other copies the measurements for kexec.  This patch prevents
> deleting the staged measurements after the latter notifier.
> 
> Instead of introducing a specific method for detecting whether the measurement
> list has been copied, rely on one of the two existing reboot notifiers. The
> simplest method would test "ima_measurements_suspended", which would prevent
> deleting the staged measurements a bit earlier.

Testing that the reboot notifier fired (with the
ima_measurements_suspended variable) is not enough to know whether the
measurements dump took place or not.

We need a flag (one is enough) protected by ima_extend_list_mutex, so
that we know reliably which event occurred first, either the dump or the
staging/delete (which are also protected by ima_extend_list_mutex).
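
To make the ordering argument concrete, here is a minimal userspace sketch
of the idea. Everything in it is an assumption for illustration: the
mock_* names, the pthread mutex standing in for ima_extend_list_mutex, and
the -EBUSY return value are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the measurement list; not the IMA structures. */
struct mock_list {
	pthread_mutex_t lock;	/* plays the role of ima_extend_list_mutex */
	bool copied_for_kexec;	/* the single flag, only touched under lock */
	int staged;		/* number of staged measurements */
};

void mock_list_init(struct mock_list *l, int staged)
{
	pthread_mutex_init(&l->lock, NULL);
	l->copied_for_kexec = false;
	l->staged = staged;
}

/* kexec side: copy the list and record, under the lock, that it happened. */
void mock_dump_for_kexec(struct mock_list *l)
{
	pthread_mutex_lock(&l->lock);
	/* ... measurements would be copied into the kexec buffer here ... */
	l->copied_for_kexec = true;
	pthread_mutex_unlock(&l->lock);
}

/* Deletion side: refuse once the dump has already taken its copy. */
int mock_delete_staged(struct mock_list *l)
{
	int ret = 0;

	pthread_mutex_lock(&l->lock);
	if (l->copied_for_kexec)
		ret = -EBUSY;	/* error code chosen for illustration only */
	else
		l->staged = 0;
	pthread_mutex_unlock(&l->lock);
	return ret;
}

/* Usage: deletion succeeds before the dump and is refused after it. */
void mock_demo(void)
{
	struct mock_list l;

	mock_list_init(&l, 3);
	assert(mock_delete_staged(&l) == 0);
	l.staged = 2;
	mock_dump_for_kexec(&l);
	assert(mock_delete_staged(&l) == -EBUSY);
	assert(l.staged == 2);
}
```

Because both sides take the same lock, whichever runs second sees a
consistent answer. A notifier-fired variable set outside that lock cannot
give the same guarantee.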


Roberto



* Re: [PATCH] tpm-buf: memory-safe allocations
From: James Bottomley @ 2026-05-29 14:08 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-integrity, Jarkko Sakkinen, Arun Menon, Daniel P. Smith,
	Alec Brown, Ross Philipson, Stefan Berger, Peter Huewe,
	Jason Gunthorpe, Mimi Zohar, David Howells, Paul Moore,
	James Morris, Serge E. Hallyn, linux-kernel, keyrings,
	linux-security-module
In-Reply-To: <ahVRefyT4BTKOu0m@kernel.org>

On Tue, 2026-05-26 at 10:53 +0300, Jarkko Sakkinen wrote:
> On Mon, May 25, 2026 at 01:50:51PM -0400, James Bottomley wrote:
> > On Fri, 2026-05-22 at 04:35 +0300, Jarkko Sakkinen wrote:
> > > > Decouple kzalloc from buffer creation, so that a managed
> > > > allocation can be used:
> > > > 
> > > > 	struct tpm_buf *buf __free(kfree) =
> > > > 		kzalloc(TPM_BUFSIZE, GFP_KERNEL);
> > > 	if (!buf)
> > > 		return -ENOMEM;
> > > 
> > > 	tpm_buf_init(buf, TPM_BUFSIZE);
> > > 
> > > Alternatively, stack allocations are also possible:
> > > 
> > > 	u8 buf_data[512];
> > > 	struct tpm_buf *buf = (struct tpm_buf *)buf_data;
> > > 	tpm_buf_init(buf, sizeof(buf_data));
> > 
> > > This isn't really a good idea from a security point of view.
> > > Remember the buffer has to be big enough for both the sent and the
> > received data.  Today we simply set TPM_BUFSIZE to the maximum
> > amount a TPM requires and all the send and receives just work.  If
> > we let callers set this size, we're asking for them to get it wrong
> > (or at least forget about the receive part) and for us to get a DMA
> > overrun from the TPM ... which might be potentially exploitable
> > depending on how it occurs (think of an unseal of user chosen data
> > overrunning).
> 
> It's one patch so you're free to remark the call sites where this
> happens. This is not a major concern at all.

Nearly twenty years ago, when the kernel was a lot smaller, a then
kernel luminary called Rusty Russell realized we needed to pay much
more attention to how we design APIs inside the kernel if we wanted it
to grow successfully.  He published his initial thoughts and gave talks
at both the kernel summit and OLS on it:

https://ozlabs.org/~rusty/index.cgi/tech/2008-03-18.html

The key point that's always stuck with me is "hard to misuse beats easy
to use". Later he came up with a rating scale (now known as the Rusty
API classification):

https://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html

and for chuckles and grins on April Fools' Day he came up with a
negative rating ridiculing some of our dafter API choices:

https://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html

The point for this patch set is that the sizing of the original tpm_buf
interface scores 10/10 on the Rusty scale (it's impossible to get
wrong).  Simply threading size through the whole API, as this patch
does, may look like the right answer, but it causes a massive reduction
in API score.  In fact, since the buffer has to be sized not only
according to what goes in, but also what gets returned and this is
nowhere mentioned in the new documentation it scores -3 (read the
documentation and you can still get it wrong).  Now by mentioning the
sizing problems in the doc, you can probably get it up to +3 (read the
documentation and you'll get it right) but my question was not if you
got it wrong somewhere in the patch but whether we couldn't do a whole
lot better in terms of API score by designing a better API.

A key point about the 185 version of the TPM spec is that it's really
only a few commands that need larger buffers (the Post Quantum ML-KEM
keys), and those don't apply to most of the in-kernel TPM callsites.
Since tpm_buf_init takes the ordinal, we can actually tell at runtime
(or compile time if the ordinal is a constant) if the command would
need a larger buffer.  We can also tell from the TPM properties whether
the TPM itself can take a larger buffer, so for every current TPM we
could retain the original score 10/10 API and warn at runtime if there
might be a problem.  Then the larger keys seem to fit into 8k, so we
could still retain most of the original API properties of being
difficult to misuse simply by having an 8k size flag (which we could
ignore if the TPM doesn't support it) and warn at runtime if
tpm_buf_init sends an ordinal which might need a larger buffer.  At
worst we should be able to get to an API which scores 5/10 (do it right
or it will break at runtime).
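
As a rough sketch of what I mean (userspace C, purely illustrative): the
ordinals, the mock_* names, the 4k default and the exact warning policy
below are all assumptions, not existing kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_BUF_DEFAULT 4096u	/* assumed default buffer size */
#define MOCK_BUF_LARGE   8192u	/* assumed size for the larger keys */

struct mock_buf_choice {
	size_t size;	/* buffer size the init routine would pick */
	bool warn;	/* true if a runtime warning should be emitted */
};

/* Placeholder ordinals only; these are NOT real TPM command codes. */
bool mock_ordinal_may_need_large(uint32_t ordinal)
{
	return ordinal == 0xF001u || ordinal == 0xF002u;
}

/*
 * The caller never passes a size. It may set a "large" flag, which is
 * ignored when the TPM cannot take a larger buffer. A warning is raised
 * when the ordinal may need the larger buffer, the TPM supports it, and
 * the caller did not ask for it.
 */
struct mock_buf_choice mock_choose_buf(uint32_t ordinal, bool want_large,
				       bool tpm_supports_large)
{
	struct mock_buf_choice c = { MOCK_BUF_DEFAULT, false };

	if (want_large && tpm_supports_large)
		c.size = MOCK_BUF_LARGE;

	if (mock_ordinal_may_need_large(ordinal) && tpm_supports_large &&
	    c.size != MOCK_BUF_LARGE)
		c.warn = true;

	return c;
}

/* Usage: an ordinary command keeps the default size and never warns. */
void mock_demo(void)
{
	struct mock_buf_choice c = mock_choose_buf(0x0001u, false, true);

	assert(c.size == MOCK_BUF_DEFAULT);
	assert(!c.warn);
}
```

The point of the shape is that the only thing a caller can get wrong is
forgetting the flag, and that mistake is reported at runtime instead of
turning into an overrun.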

Regards,

James


* Re: [PATCH bpf v3 2/2] bpf, libbpf: reject non-exclusive metadata maps in the signed loader
From: Daniel Borkmann @ 2026-05-29 12:25 UTC (permalink / raw)
  To: Alexei Starovoitov, KP Singh
  Cc: bpf, LSM List, Alexei Starovoitov, Kumar Kartikeya Dwivedi
In-Reply-To: <CAADnVQLJsvCfRxyLT-NJRubwSPTNd0k5bEp45Zyu9q1B_3oG+A@mail.gmail.com>

On 5/23/26 5:12 PM, Alexei Starovoitov wrote:
> On Fri, May 22, 2026 at 11:53 PM KP Singh <kpsingh@kernel.org> wrote:
>>
>> The loader verifies map->sha against the metadata hash in its
>> instructions. map->sha is calculated when BPF_OBJ_GET_INFO_BY_FD is called
>> on the frozen map.
>>
>> While the map is frozen, the loader must also ensure the map is
>> exclusive, as, without exclusivity, another BPF program with map access
>> can mutate the contents afterwards, so the check passes on stale data.
> 
> Hold on. How is this an issue? excl_prog_sha guarantees
> that only loader prog can use this map.
> Are you saying the same loader prog will use the same map
> for the 2nd time. Ok. I still don't see a problem.
> 
>> Place excl_prog_sha right after sha[] in struct bpf_map and have
>> gen_loader bail with -EINVAL when it is NULL, via BPF_PSEUDO_MAP_IDX at
>> fixed offset 32. The 8-byte read of the pointer field limits this to
>> 64-bit kernels; gen_loader needs target pointer size tracking to emit
>> the right sized read on 32-bit (follow-up).
> 
> I don't think we can go from maybe-racy to certainly-broken-on-32-bit.
> So only applied patch 1.

I've looked a bit more into it with regard to the above question from Alexei
as well as the __bpf_md_ptr issue.

Imho, KP is correct that the extra check/enforcement is needed. So Alice
as a trusted signer generates the loader program (loader_insns + data_blob)
and signs it. The loader program contains the below enforcement to reject
if the metadata map was not exclusive.

Now the (untrusted) host that wants to load the program holds a signed
loader in which it can't change a byte without breaking the signature.

However, it could simply omit excl_prog_hash on BPF_MAP_CREATE for the data
map (which would "normally" be bound exclusively to the loader).

Then check_map_prog_compatibility() enforcement is skipped on verifier side
given excl_prog_sha is not set. The loader loads fine, the fingerprint check
can then pass against a stale snapshot while a different program mangled the
data_blob underneath.

Regarding __bpf_md_ptr, I would solve it differently via fixed size, see below
together with the excl check coming before the signature check in the loader
and the build bug assertions, and a jmp not eq to 1.
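
To illustrate why a fixed-size field sidesteps the pointer-width problem,
here is a small userspace sketch. The mock_map struct and
mock_loader_check_excl() are hypothetical stand-ins; only the layout idea
(sha[] at offset 0, a 32-bit flag right after it, compare against 1) is
taken from the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_DIGEST_SIZE 32	/* SHA-256 digest length in bytes */

/* Hypothetical mock of the leading fields; not the real struct bpf_map. */
struct mock_map {
	uint8_t sha[MOCK_DIGEST_SIZE];
	uint32_t excl;		/* fixed 4 bytes on 32-bit and 64-bit alike */
	const void *ops;	/* pointer-sized fields only come afterwards */
};

/* Same role as the BUILD_BUG_ON() pair: pin the offsets at build time. */
_Static_assert(offsetof(struct mock_map, sha) == 0,
	       "sha must be the first field");
_Static_assert(offsetof(struct mock_map, excl) == MOCK_DIGEST_SIZE,
	       "excl must directly follow sha");

/*
 * Mirrors the emitted check: a 4-byte load at offset 32 from the map
 * address, then bail out with -EINVAL unless the value is exactly 1.
 */
int mock_loader_check_excl(const void *map)
{
	uint32_t excl;

	memcpy(&excl, (const uint8_t *)map + MOCK_DIGEST_SIZE, sizeof(excl));
	return excl == 1 ? 0 : -EINVAL;
}

/* Usage: a map created without the exclusive hash is rejected. */
void mock_demo(void)
{
	struct mock_map m;

	memset(&m, 0, sizeof(m));
	assert(mock_loader_check_excl(&m) == -EINVAL);
	m.excl = 1;
	assert(mock_loader_check_excl(&m) == 0);
}
```

Since the read is always 4 bytes at a fixed offset, the generated loader
does not need to know the target's pointer size.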

  include/linux/bpf.h        |  1 +
  kernel/bpf/syscall.c       |  5 +++++
  tools/lib/bpf/gen_loader.c | 17 +++++++++++++++++
  3 files changed, 23 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index cd191c5fdb0a..487f4653d8a6 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -295,6 +295,7 @@ struct bpf_map_owner {
  
  struct bpf_map {
  	u8 sha[SHA256_DIGEST_SIZE];
+	u32 excl;
  	const struct bpf_map_ops *ops;
  	struct bpf_map *inner_map_meta;
  #ifdef CONFIG_SECURITY
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 630d530782fe..37dacdbc5c01 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1572,6 +1572,11 @@ static int map_create(union bpf_attr *attr, bpfptr_t uattr)
  			err = -EFAULT;
  			goto free_map;
  		}
+
+		/* See libbpf: emit_signature_match() */
+		BUILD_BUG_ON(offsetof(struct bpf_map, excl) != SHA256_DIGEST_SIZE);
+		BUILD_BUG_ON(offsetof(struct bpf_map, sha)  != 0);
+		map->excl = 1;
  	} else if (attr->excl_prog_hash_size) {
  		err = -EINVAL;
  		goto free_map;
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
index bcea21c3b7bb..cd8d7df94ac7 100644
--- a/tools/lib/bpf/gen_loader.c
+++ b/tools/lib/bpf/gen_loader.c
@@ -586,6 +586,23 @@ static void emit_signature_match(struct bpf_gen *gen)
  	__s64 off;
  	int i;
  
+	/*
+	 * Reject if the metadata map is not exclusive. Without exclusivity
+	 * the cached map->sha[] verified above can be stale: another BPF
+	 * program with map access could have mutated the contents between
+	 * BPF_OBJ_GET_INFO_BY_FD and loader execution.
+	 */
+	emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX,
+					 0, 0, 0, 0));
+	emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, SHA256_DIGEST_LENGTH));
+	off = -(gen->insn_cur - gen->insn_start - gen->cleanup_label) / 8 - 2;
+	if (is_simm16(off)) {
+		emit(gen, BPF_MOV64_IMM(BPF_REG_7, -EINVAL));
+		emit(gen, BPF_JMP_IMM(BPF_JNE, BPF_REG_2, 1, off));
+	} else {
+		gen->error = -ERANGE;
+	}
+
  	for (i = 0; i < SHA256_DWORD_SIZE; i++) {
  		emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX,
  						 0, 0, 0, 0));
-- 
2.43.0


* Re: [PATCH v4 1/2] rust: task: clarify comments on task UID accessors
From: Gary Guo @ 2026-05-29 12:17 UTC (permalink / raw)
  To: Alice Ryhl, Paul Moore, Serge Hallyn, Jonathan Corbet,
	Greg Kroah-Hartman, Shuah Khan, Alex Shi, Yanteng Si,
	Dongliang Mu
  Cc: Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Trevor Gross, Danilo Krummrich,
	Jann Horn, linux-security-module, linux-doc, linux-kernel,
	rust-for-linux
In-Reply-To: <20260529-remove-task-euid-v4-1-07cbdf3af980@google.com>

On Fri May 29, 2026 at 10:33 AM BST, Alice Ryhl wrote:
> From: Jann Horn <jannh@google.com>
> 
> Linux has separate subjective and objective task credentials, see the
> comment above `struct cred`. Clarify which accessor functions operate on
> which set of credentials.
> 
> Also document that Task::euid() is a very weird operation. You can see how
> weird it is by grepping for task_euid() - binder is its only user.
> Task::euid() obtains the objective effective UID - it looks at the
> credentials of the task for purposes of acting on it as an object, but then
> accesses the effective UID (which the credentials.7 man page describes as
> "[...] used by the kernel to determine the permissions that the process
> will have when accessing shared resources [...]").
> 
> For context:
> Arguably, binder's use of task_euid() is a theoretical security problem,
> which only has no impact on Android because Android has no setuid binaries
> executable by apps.
> commit 29bc22ac5e5b ("binder: use euid from cred instead of using task")
> fixed that by removing that only user of task_euid(), but the fix got
> reverted in commit c21a80ca0684 ("binder: fix test regression due to
> sender_euid change") because some Android test started failing.
> 
> Signed-off-by: Jann Horn <jannh@google.com>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>

Reviewed-by: Gary Guo <gary@garyguo.net>

> ---
> Originally sent as:
> https://lore.kernel.org/r/20260212-rust-uid-v1-1-deff4214c766@google.com
> ---
>  rust/kernel/task.rs | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)



* Re: [REPORT] landlock: SCOPE_SIGNAL bypass via F_SETOWN to invoker pgid -> SIGIO/SIGKILL to non-sandboxed targets
From: Mickaël Salaün @ 2026-05-29 11:08 UTC (permalink / raw)
  To: hexlabsecurity
  Cc: Justin Suess, gnoack@google.com,
	linux-security-module@vger.kernel.org, stable@vger.kernel.org
In-Reply-To: <TSwHGN3I-u6p6xv7CqnvDOhR3la_kQWq0rdjBdA0gt30AsYLwddoxjCCFmqXcQMxWHS4ShULEp7sO_8HdFRGPLk30rIQHy3EurwJyrjP3NQ=@proton.me>

Hi,

Thanks for the report.  Could you please replace the reproducer code
with a proper kselftest?

That would need to be a new email patch (v3) as explained here:
https://docs.kernel.org/process/submitting-patches.html

Regards,
 Mickaël

On Fri, May 29, 2026 at 04:43:02AM +0000, hexlabsecurity@proton.me wrote:
> Thanks Justin -- much appreciated for reproducing on mic/next and for the
> Tested-by.
> 
> v2 below addresses your review:
>   - the commit message is trimmed to just the bug and the fix;
>   - the reproducer and the A/B verification are moved below the --- so
>     they become git notes, not part of the commit;
>   - added your Tested-by.
> 
> The fix hunk is unchanged. I agree the concise statement of the defect is
> "we fail to check the subject on fan-out signal types (PIDTYPE_PGID and
> PIDTYPE_SID, i.e. type > PIDTYPE_TGID)". The patch keeps the explicit
> PIDTYPE_PGID / PIDTYPE_SID test for readability and to stay robust if the
> enum is ever reordered -- happy to switch to "> PIDTYPE_TGID" if you
> prefer. I'll follow up separately on the erratum entry and a regression
> test, as you suggested.
> 
> Independent security researcher. HEXLAB SAS (registration pending) --
> Cali, Colombia.
> 
> Thanks,
> Bryam Vargas
> 
> ----- v2 patch (inline, plain text) -----
> 
> From 75f801309cd64f74d04ef86236bd973314dd7d94 Mon Sep 17 00:00:00 2001
> From: Bryam Vargas <hexlabsecurity@proton.me>
> Date: Thu, 28 May 2026 23:33:13 -0500
> Subject: [PATCH v2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via F_SETOWN to
>  invoker's pgid
> 
> A Landlock-restricted process can bypass LANDLOCK_SCOPE_SIGNAL on the
> SIGIO delivery path and deliver arbitrary signals (including SIGKILL via
> F_SETSIG) to non-Landlocked targets that share its pgid, by exploiting a
> producer-side cache-vs-live evaluation gap.
> 
> The SIGIO path in hook_file_send_sigiotask() consults a cached subject
> stored in landlock_file(file)->fown_subject at fcntl(F_SETOWN) time
> (via hook_file_set_fowner()), instead of evaluating the live Landlock
> domain of the invoking task at signal-send time. The capture is gated
> by control_current_fowner(), which returns false (skipping capture)
> when pid_task(fown->pid, fown->pid_type) is in current's thread group.
> 
> This is correct for PIDTYPE_TGID / PIDTYPE_PID, where the target is a
> single task sharing current's cred. It is unsafe for PIDTYPE_PGID and
> PIDTYPE_SID: when current is at the head of its pgid hlist -- the
> default placement after fork(), hlist_add_head_rcu() in kernel/fork.c --
> pid_task(pgid, PIDTYPE_PGID) resolves to current itself,
> same_thread_group(current, current) is true, the capture is skipped, and
> fown_subject.domain stays NULL. hook_file_send_sigiotask() then
> short-circuits at "if (!subject->domain) return 0;", letting the kernel
> fan the signal out to every member of the group, including tasks outside
> current's Landlock domain that SCOPE_SIGNAL is supposed to protect.
> 
> The direct kill() path (hook_task_kill) is unaffected: it evaluates
> current's live domain on every call. Only the cached SIGIO path is
> broken.
> 
> Tighten control_current_fowner() to apply the thread-group exemption
> only when the target identifies a single task whose Landlock cred is
> necessarily shared with current (PIDTYPE_TGID, PIDTYPE_PID). For
> PIDTYPE_PGID and PIDTYPE_SID, always capture the current Landlock
> subject so the consumer's scope check runs against every member of the
> group at delivery time.
> 
> Reported-by: Bryam Vargas <hexlabsecurity@proton.me>
> Tested-by: Justin Suess <utilityemal77@gmail.com>
> Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
> ---
> v2: per review, the commit message is trimmed to the bug + the fix; the
>     reproducer and the A/B verification are moved below the --- so they
>     stay out of the commit. Added Tested-by. The hunk is unchanged from
>     v1 (v1 sent to security@kernel.org 2026-05-28, embargoed -- not yet
>     in a public archive).
> 
> Reproducer (ordinary unprivileged user; sandbox active in the child):
> 
>   int pfd[2]; pipe(pfd);
>   landlock_create_ruleset(&{.scoped = LANDLOCK_SCOPE_SIGNAL},
>                           sizeof(attr), 0);
>   prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
>   landlock_restrict_self(rfd, 0);
>   fcntl(pfd[0], F_SETSIG, SIGKILL);
>   fcntl(pfd[0], F_SETOWN, -getpgrp());           /* PIDTYPE_PGID */
>   fcntl(pfd[0], F_SETFL, O_ASYNC);
>   write(pfd[1], "X", 1);                         /* trigger SIGIO */
>   /* every pgid member receives SIGKILL, including the non-sandboxed
>    * parent / supervisor / sibling workers */
> 
> A/B-verified on a 6.12.90 lab kernel (same .config, only this hunk
> differs): pre-fix the sandboxed child's SIGKILL reaches the
> non-sandboxed parent (SCOPE_SIGNAL bypassed); post-fix it is blocked.
> hook_task_kill's direct-kill enforcement and the intra-thread-group
> F_SETOWN cases continue to work post-patch.
> 
>  security/landlock/fs.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index c1ecfe239032..edaa52572cbd 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -1909,6 +1909,18 @@ static bool control_current_fowner(struct fown_struct *const fown)
>  	if (!p)
>  		return true;
> 
> +	/*
> +	 * For PIDTYPE_PGID and PIDTYPE_SID, signal delivery fans out to
> +	 * every member of the group at SIGIO time. Even when pid_task()
> +	 * resolves to current itself (e.g., current is the pgid hlist
> +	 * head post-fork), non-current members of the group are still
> +	 * valid targets that must be checked by hook_file_send_sigiotask().
> +	 * Always capture the current subject for those types so the
> +	 * consumer scope check runs against the live fown_subject.
> +	 */
> +	if (fown->pid_type == PIDTYPE_PGID || fown->pid_type == PIDTYPE_SID)
> +		return true;
> +
>  	return !same_thread_group(p, current);
>  }
> --
> 2.43.0
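
For reference, the decision made by the quoted hunk can be modelled in
userspace roughly as follows. This is only a sketch: the mock_* types and
the "patched" switch are assumptions for illustration and do not exist in
security/landlock/fs.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mock of the owner types; not the kernel's enum pid_type. */
enum mock_pid_type { MOCK_PID, MOCK_TGID, MOCK_PGID, MOCK_SID };

/* Hypothetical mock task: only the thread-group identity matters here. */
struct mock_task {
	int tgid;
};

/*
 * Returns true when the current Landlock subject must be captured at
 * F_SETOWN time. "patched" selects the behaviour with or without the
 * quoted hunk. "target" models what pid_task() resolved to.
 */
bool mock_must_capture(const struct mock_task *target,
		       enum mock_pid_type type,
		       const struct mock_task *cur, bool patched)
{
	if (!target)
		return true;

	/* Fan-out owner types: other group members may be outside the domain. */
	if (patched && (type == MOCK_PGID || type == MOCK_SID))
		return true;

	/* Single-task owner types: exempt only our own thread group. */
	return target->tgid != cur->tgid;
}

/* Usage: current is the pgid head, so the lookup resolves to itself. */
void mock_demo(void)
{
	struct mock_task cur = { .tgid = 100 };

	assert(!mock_must_capture(&cur, MOCK_PGID, &cur, false)); /* bypass */
	assert(mock_must_capture(&cur, MOCK_PGID, &cur, true));   /* fixed */
}
```

A kselftest would need to cover the same cases: process-group and session
owners must always be checked, while the single-task exemption stays in
place.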


* [PATCH v4 2/2] cred: delete task_euid()
From: Alice Ryhl @ 2026-05-29  9:33 UTC (permalink / raw)
  To: Paul Moore, Serge Hallyn, Jonathan Corbet, Greg Kroah-Hartman,
	Shuah Khan, Alex Shi, Yanteng Si, Dongliang Mu
  Cc: Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Trevor Gross, Danilo Krummrich,
	Jann Horn, linux-security-module, linux-doc, linux-kernel,
	rust-for-linux, Alice Ryhl
In-Reply-To: <20260529-remove-task-euid-v4-0-07cbdf3af980@google.com>

task_euid() is a very weird operation. You can see how weird it is by
grepping for task_euid() - binder is its only user. task_euid() obtains
the objective effective UID - it looks at the credentials of the task
for purposes of acting on it as an object, but then accesses the
effective UID (which the credentials.7 man page describes as "[...] used
by the kernel to determine the permissions that the process will have
when accessing shared resources [...]").

Since usage in Binder has now been removed, get rid of the resulting
dead code.

Changes to the zh_CN translation were carried out with the help of
Gemini and Google Translate, and have since been adjusted as per Alex
Shi's feedback.

Suggested-by: Jann Horn <jannh@google.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 Documentation/security/credentials.rst                    |  6 ++----
 Documentation/translations/zh_CN/security/credentials.rst |  4 +---
 include/linux/cred.h                                      |  1 -
 rust/helpers/task.c                                       |  5 -----
 rust/kernel/task.rs                                       | 10 ----------
 5 files changed, 3 insertions(+), 23 deletions(-)

diff --git a/Documentation/security/credentials.rst b/Documentation/security/credentials.rst
index d0191c8b8060..81d3b5737d85 100644
--- a/Documentation/security/credentials.rst
+++ b/Documentation/security/credentials.rst
@@ -393,16 +393,14 @@ the credentials so obtained when they're finished with.
    The result of ``__task_cred()`` should not be passed directly to
    ``get_cred()`` as this may race with ``commit_cred()``.
 
-There are a couple of convenience functions to access bits of another task's
-credentials, hiding the RCU magic from the caller::
+There is a convenience function to access bits of another task's credentials,
+hiding the RCU magic from the caller::
 
 	uid_t task_uid(task)		Task's real UID
-	uid_t task_euid(task)		Task's effective UID
 
 If the caller is holding the RCU read lock at the time anyway, then::
 
 	__task_cred(task)->uid
-	__task_cred(task)->euid
 
 should be used instead.  Similarly, if multiple aspects of a task's credentials
 need to be accessed, RCU read lock should be used, ``__task_cred()`` called,
diff --git a/Documentation/translations/zh_CN/security/credentials.rst b/Documentation/translations/zh_CN/security/credentials.rst
index 88fcd9152ffe..20c8696f8198 100644
--- a/Documentation/translations/zh_CN/security/credentials.rst
+++ b/Documentation/translations/zh_CN/security/credentials.rst
@@ -337,15 +337,13 @@ const指针上操作，因此不需要进行类型转换，但需要临时放弃
    ``__task_cred()`` 的结果不应直接传递给 ``get_cred()`` ，
    因为这可能与 ``commit_cred()`` 发生竞争条件。
 
-还有一些方便的函数可以访问另一个任务凭据的特定部分，将RCU操作对调用方隐藏起来::
+有一个方便的函数可用于访问另一个任务凭据的特定部分，从而对调用方隐藏RCU机制::
 
 	uid_t task_uid(task)		Task's real UID
-	uid_t task_euid(task)		Task's effective UID
 
 如果调用方在此时已经持有RCU读锁，则应使用::
 
 	__task_cred(task)->uid
-	__task_cred(task)->euid
 
 类似地，如果需要访问任务凭据的多个方面，应使用RCU读锁，调用 ``__task_cred()``
 函数，将结果存储在临时指针中，然后从临时指针中调用凭据的各个方面，最后释放锁。
diff --git a/include/linux/cred.h b/include/linux/cred.h
index c6676265a985..6ef1750c93e2 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -371,7 +371,6 @@ DEFINE_FREE(put_cred, struct cred *, if (!IS_ERR_OR_NULL(_T)) put_cred(_T))
 })
 
 #define task_uid(task)		(task_cred_xxx((task), uid))
-#define task_euid(task)		(task_cred_xxx((task), euid))
 #define task_ucounts(task)	(task_cred_xxx((task), ucounts))
 
 #define current_cred_xxx(xxx)			\
diff --git a/rust/helpers/task.c b/rust/helpers/task.c
index c0e1a06ede78..b46b1433a67e 100644
--- a/rust/helpers/task.c
+++ b/rust/helpers/task.c
@@ -28,11 +28,6 @@ __rust_helper kuid_t rust_helper_task_uid(struct task_struct *task)
 	return task_uid(task);
 }
 
-__rust_helper kuid_t rust_helper_task_euid(struct task_struct *task)
-{
-	return task_euid(task);
-}
-
 #ifndef CONFIG_USER_NS
 __rust_helper uid_t rust_helper_from_kuid(struct user_namespace *to, kuid_t uid)
 {
diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
index eabd65bfde12..c2b3457b700c 100644
--- a/rust/kernel/task.rs
+++ b/rust/kernel/task.rs
@@ -217,16 +217,6 @@ pub fn uid(&self) -> Kuid {
         Kuid::from_raw(unsafe { bindings::task_uid(self.as_ptr()) })
     }
 
-    /// Returns the objective effective UID of the given task.
-    ///
-    /// You should probably not be using this; the effective UID is normally
-    /// only relevant in subjective credentials.
-    #[inline]
-    pub fn euid(&self) -> Kuid {
-        // SAFETY: It's always safe to call `task_euid` on a valid task.
-        Kuid::from_raw(unsafe { bindings::task_euid(self.as_ptr()) })
-    }
-
     /// Determines whether the given task has pending signals.
     #[inline]
     pub fn signal_pending(&self) -> bool {

-- 
2.54.0.823.g6e5bcc1fc9-goog



* [PATCH v4 1/2] rust: task: clarify comments on task UID accessors
From: Alice Ryhl @ 2026-05-29  9:33 UTC (permalink / raw)
  To: Paul Moore, Serge Hallyn, Jonathan Corbet, Greg Kroah-Hartman,
	Shuah Khan, Alex Shi, Yanteng Si, Dongliang Mu
  Cc: Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Trevor Gross, Danilo Krummrich,
	Jann Horn, linux-security-module, linux-doc, linux-kernel,
	rust-for-linux, Alice Ryhl
In-Reply-To: <20260529-remove-task-euid-v4-0-07cbdf3af980@google.com>

From: Jann Horn <jannh@google.com>

Linux has separate subjective and objective task credentials, see the
comment above `struct cred`. Clarify which accessor functions operate on
which set of credentials.

Also document that Task::euid() is a very weird operation. You can see how
weird it is by grepping for task_euid() - binder is its only user.
Task::euid() obtains the objective effective UID - it looks at the
credentials of the task for purposes of acting on it as an object, but then
accesses the effective UID (which the credentials.7 man page describes as
"[...] used by the kernel to determine the permissions that the process
will have when accessing shared resources [...]").
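
As a rough illustration of the distinction (a userspace sketch with
hypothetical mock_* types, assuming the usual split where a task's
real_cred holds its objective credentials and cred holds its subjective
ones):

```c
#include <assert.h>

/* Hypothetical mock types; not the kernel's struct cred or task_struct. */
struct mock_cred {
	unsigned int uid;	/* real UID */
	unsigned int euid;	/* effective UID */
};

struct mock_task {
	const struct mock_cred *real_cred;	/* objective: task as object */
	const struct mock_cred *cred;		/* subjective: task as actor */
};

/* Objective real UID: who this task is when something acts on it. */
unsigned int mock_objective_uid(const struct mock_task *t)
{
	return t->real_cred->uid;
}

/* Objective effective UID: the odd combination described above. */
unsigned int mock_objective_euid(const struct mock_task *t)
{
	return t->real_cred->euid;
}

/* Subjective effective UID: what the task acts with. */
unsigned int mock_subjective_euid(const struct mock_task *t)
{
	return t->cred->euid;
}

/* Usage: with temporarily overridden credentials the two euids differ. */
void mock_demo(void)
{
	const struct mock_cred own = { .uid = 1000, .euid = 1000 };
	const struct mock_cred overridden = { .uid = 0, .euid = 0 };
	const struct mock_task t = { .real_cred = &own, .cred = &overridden };

	assert(mock_objective_uid(&t) == 1000);
	assert(mock_objective_euid(&t) == 1000);
	assert(mock_subjective_euid(&t) == 0);
}
```
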

For context:
Arguably, binder's use of task_euid() is a theoretical security problem,
which only has no impact on Android because Android has no setuid binaries
executable by apps.
commit 29bc22ac5e5b ("binder: use euid from cred instead of using task")
fixed that by removing that only user of task_euid(), but the fix got
reverted in commit c21a80ca0684 ("binder: fix test regression due to
sender_euid change") because some Android test started failing.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
Originally sent as:
https://lore.kernel.org/r/20260212-rust-uid-v1-1-deff4214c766@google.com
---
 rust/kernel/task.rs | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
index 38273f4eedb5..eabd65bfde12 100644
--- a/rust/kernel/task.rs
+++ b/rust/kernel/task.rs
@@ -210,14 +210,17 @@ pub fn pid(&self) -> Pid {
         unsafe { *ptr::addr_of!((*self.as_ptr()).pid) }
     }

-    /// Returns the UID of the given task.
+    /// Returns the objective real UID of the given task.
     #[inline]
     pub fn uid(&self) -> Kuid {
         // SAFETY: It's always safe to call `task_uid` on a valid task.
         Kuid::from_raw(unsafe { bindings::task_uid(self.as_ptr()) })
     }

-    /// Returns the effective UID of the given task.
+    /// Returns the objective effective UID of the given task.
+    ///
+    /// You should probably not be using this; the effective UID is normally
+    /// only relevant in subjective credentials.
     #[inline]
     pub fn euid(&self) -> Kuid {
         // SAFETY: It's always safe to call `task_euid` on a valid task.
@@ -371,7 +374,7 @@ fn eq(&self, other: &Self) -> bool {
 impl Eq for Task {}

 impl Kuid {
-    /// Get the current euid.
+    /// Get the current subjective effective UID.
     #[inline]
     pub fn current_euid() -> Kuid {
         // SAFETY: Just an FFI call.

-- 
2.54.0.823.g6e5bcc1fc9-goog


* [PATCH v4 0/2] Delete task_euid()
From: Alice Ryhl @ 2026-05-29  9:33 UTC (permalink / raw)
  To: Paul Moore, Serge Hallyn, Jonathan Corbet, Greg Kroah-Hartman,
	Shuah Khan, Alex Shi, Yanteng Si, Dongliang Mu
  Cc: Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Trevor Gross, Danilo Krummrich,
	Jann Horn, linux-security-module, linux-doc, linux-kernel,
	rust-for-linux, Alice Ryhl

The task_euid() method is a very weird method, and Binder was the only
user. As of commit 65b672152289 ("binder: use current_euid() for
transaction sender identity") Binder doesn't use task_euid() anymore,
so we can delete this method.

My suggestion would be to merge this through the LSM tree.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
Changes in v4:
- Reword 'euid' -> 'effective UID' in 'Kuid::current_euid()' docs.
- Link to v3: https://lore.kernel.org/r/20260507-remove-task-euid-v3-0-27f22f335c2c@google.com

Changes in v3:
- Include 'task' clarification commit in series.
- Rebase and resend.
- Link to v2: https://lore.kernel.org/r/20260227-remove-task-euid-v2-1-9a9c80a82eb6@google.com

Changes in v2:
- Update translation as per Alex Shi.
- Pick up Reviewed-by Gary.
- Update commit title to use cred: prefix.
- Link to v1: https://lore.kernel.org/r/20260219-remove-task-euid-v1-1-904060826e07@google.com

---
Alice Ryhl (1):
      cred: delete task_euid()

Jann Horn (1):
      rust: task: clarify comments on task UID accessors

 Documentation/security/credentials.rst                    |  6 ++----
 Documentation/translations/zh_CN/security/credentials.rst |  4 +---
 include/linux/cred.h                                      |  1 -
 rust/helpers/task.c                                       |  5 -----
 rust/kernel/task.rs                                       | 11 ++---------
 5 files changed, 5 insertions(+), 22 deletions(-)
---
base-commit: 7fd2df204f342fc17d1a0bfcd474b24232fb0f32
change-id: 20260219-remove-task-euid-19e4b00beebe

Best regards,
-- 
Alice Ryhl <aliceryhl@google.com>



* Re: [REPORT] landlock: SCOPE_SIGNAL bypass via F_SETOWN to invoker pgid -> SIGIO/SIGKILL to non-sandboxed targets
From: hexlabsecurity @ 2026-05-29  4:43 UTC (permalink / raw)
  To: Justin Suess
  Cc: mic@digikod.net, gnoack@google.com,
	linux-security-module@vger.kernel.org, stable@vger.kernel.org

Thanks Justin -- much appreciated for reproducing on mic/next and for the
Tested-by.

v2 below addresses your review:
  - the commit message is trimmed to just the bug and the fix;
  - the reproducer and the A/B verification are moved below the --- so
    they become git notes, not part of the commit;
  - added your Tested-by.

The fix hunk is unchanged. I agree the concise statement of the defect is
"we fail to check the subject on fan-out signal types (PIDTYPE_PGID and
PIDTYPE_SID, i.e. type > PIDTYPE_TGID)". The patch keeps the explicit
PIDTYPE_PGID / PIDTYPE_SID test for readability and to stay robust if the
enum is ever reordered -- happy to switch to "> PIDTYPE_TGID" if you
prefer. I'll follow up separately on the erratum entry and a regression
test, as you suggested.

Independent security researcher. HEXLAB SAS (registration pending) --
Cali, Colombia.

Thanks,
Bryam Vargas

----- v2 patch (inline, plain text) -----

From 75f801309cd64f74d04ef86236bd973314dd7d94 Mon Sep 17 00:00:00 2001
From: Bryam Vargas <hexlabsecurity@proton.me>
Date: Thu, 28 May 2026 23:33:13 -0500
Subject: [PATCH v2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via F_SETOWN to
 invoker's pgid

A Landlock-restricted process can bypass LANDLOCK_SCOPE_SIGNAL on the
SIGIO delivery path and deliver arbitrary signals (including SIGKILL via
F_SETSIG) to non-Landlocked targets that share its pgid, by exploiting a
producer-side cache-vs-live evaluation gap.

The SIGIO path in hook_file_send_sigiotask() consults a cached subject
stored in landlock_file(file)->fown_subject at fcntl(F_SETOWN) time
(via hook_file_set_fowner()), instead of evaluating the live Landlock
domain of the invoking task at signal-send time. The capture is gated
by control_current_fowner(), which returns false (skipping capture)
when pid_task(fown->pid, fown->pid_type) is in current's thread group.

This is correct for PIDTYPE_TGID / PIDTYPE_PID, where the target is a
single task sharing current's cred. It is unsafe for PIDTYPE_PGID and
PIDTYPE_SID: when current is at the head of its pgid hlist -- the
default placement after fork(), hlist_add_head_rcu() in kernel/fork.c --
pid_task(pgid, PIDTYPE_PGID) resolves to current itself,
same_thread_group(current, current) is true, the capture is skipped, and
fown_subject.domain stays NULL. hook_file_send_sigiotask() then
short-circuits at "if (!subject->domain) return 0;", letting the kernel
fan the signal out to every member of the group, including tasks outside
current's Landlock domain that SCOPE_SIGNAL is supposed to protect.

The direct kill() path (hook_task_kill) is unaffected: it evaluates
current's live domain on every call. Only the cached SIGIO path is
broken.

Tighten control_current_fowner() to apply the thread-group exemption
only when the target identifies a single task whose Landlock cred is
necessarily shared with current (PIDTYPE_TGID, PIDTYPE_PID). For
PIDTYPE_PGID and PIDTYPE_SID, always capture the current Landlock
subject so the consumer's scope check runs against every member of the
group at delivery time.

Reported-by: Bryam Vargas <hexlabsecurity@proton.me>
Tested-by: Justin Suess <utilityemal77@gmail.com>
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
v2: per review, the commit message is trimmed to the bug + the fix; the
    reproducer and the A/B verification are moved below the --- so they
    stay out of the commit. Added Tested-by. The hunk is unchanged from
    v1 (v1 sent to security@kernel.org 2026-05-28, embargoed -- not yet
    in a public archive).

Reproducer (ordinary unprivileged user; sandbox active in the child):

  int pfd[2]; pipe(pfd);
  struct landlock_ruleset_attr attr = { .scoped = LANDLOCK_SCOPE_SIGNAL };
  int rfd = landlock_create_ruleset(&attr, sizeof(attr), 0);
  prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
  landlock_restrict_self(rfd, 0);
  fcntl(pfd[0], F_SETSIG, SIGKILL);
  fcntl(pfd[0], F_SETOWN, -getpgrp());           /* PIDTYPE_PGID */
  fcntl(pfd[0], F_SETFL, O_ASYNC);
  write(pfd[1], "X", 1);                         /* trigger SIGIO */
  /* every pgid member receives SIGKILL, including the non-sandboxed
   * parent / supervisor / sibling workers */

A/B-verified on a 6.12.90 lab kernel (same .config, only this hunk
differs): pre-fix the sandboxed child's SIGKILL reaches the
non-sandboxed parent (SCOPE_SIGNAL bypassed); post-fix it is blocked.
hook_task_kill's direct-kill enforcement and the intra-thread-group
F_SETOWN cases continue to work post-patch.

 security/landlock/fs.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index c1ecfe239032..edaa52572cbd 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -1909,6 +1909,18 @@ static bool control_current_fowner(struct fown_struct *const fown)
 	if (!p)
 		return true;

+	/*
+	 * For PIDTYPE_PGID and PIDTYPE_SID, signal delivery fans out to
+	 * every member of the group at SIGIO time. Even when pid_task()
+	 * resolves to current itself (e.g., current is the pgid hlist
+	 * head post-fork), non-current members of the group are still
+	 * valid targets that must be checked by hook_file_send_sigiotask().
+	 * Always capture the current subject for those types so the
+	 * consumer scope check runs against the live fown_subject.
+	 */
+	if (fown->pid_type == PIDTYPE_PGID || fown->pid_type == PIDTYPE_SID)
+		return true;
+
 	return !same_thread_group(p, current);
 }
--
2.43.0
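
The decision this hunk changes can be modelled in user space. The following is a minimal sketch, not kernel code: the enum only mirrors the ordering of the kernel's pid types, and `must_capture_subject()` with its two boolean parameters is a hypothetical stand-in for the pid_task() lookup and the same_thread_group() test in control_current_fowner().

```c
#include <stdbool.h>

/* Mirrors the ordering of the kernel's enum pid_type (assumed unchanged). */
enum pid_type_model { PIDTYPE_PID, PIDTYPE_TGID, PIDTYPE_PGID, PIDTYPE_SID };

/*
 * Hypothetical model of control_current_fowner() after the patch.
 * target_resolved:      pid_task() found a task for the fown pid.
 * target_in_current_tg: that task is in current's thread group.
 * Returns true when the Landlock subject must be captured.
 */
static bool must_capture_subject(enum pid_type_model type, bool target_resolved,
				 bool target_in_current_tg)
{
	if (!target_resolved)
		return true;
	/* Fan-out types: other group members may sit outside the domain. */
	if (type == PIDTYPE_PGID || type == PIDTYPE_SID)
		return true;
	/* Single-task types keep the thread-group exemption. */
	return !target_in_current_tg;
}
```

In this model the pre-patch behaviour corresponds to dropping the PGID/SID branch, in which case a process-group target that resolves to current would skip the capture.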


* [PATCH] KEYS: Use acquire when reading state in keyring search
From: Gui-Dong Han @ 2026-05-29  3:34 UTC (permalink / raw)
  To: keyrings, dhowells, jarkko
  Cc: ebiggers, linux-security-module, linux-kernel, baijiaju1990,
	Gui-Dong Han

The negative-key race fix added release/acquire ordering for key use.

Publish payload before state; read state before payload.

keyring_search_iterator() still uses READ_ONCE() before match callbacks.
An asymmetric match callback calls asymmetric_key_ids(), which reads
key->payload.data[asym_key_ids].

Use key_read_state() there to complete that ordering.

Fixes: 363b02dab09b ("KEYS: Fix race between updating and finding a negative key")
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
---
Found by auditing READ_ONCE() used for synchronization.
A similar fix can be found in 8df672bfe3ec.
---
 security/keys/keyring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index b39038f7dd31..243fb1636f10 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -576,7 +576,7 @@ static int keyring_search_iterator(const void *object, void *iterator_data)
 	struct keyring_search_context *ctx = iterator_data;
 	const struct key *key = keyring_ptr_to_key(object);
 	unsigned long kflags = READ_ONCE(key->flags);
-	short state = READ_ONCE(key->state);
+	short state = key_read_state(key);
 
 	kenter("{%d}", key->serial);
 
-- 
2.34.1
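
The "publish payload before state; read state before payload" rule can be illustrated with C11 atomics in user space. This is only a sketch under stated assumptions: `struct demo_key`, `demo_instantiate()` and `demo_read_payload()` are hypothetical names, and the C11 release/acquire operations stand in for the kernel's smp_store_release() and smp_load_acquire() rather than reproducing the keyring code.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct key: a payload guarded by a state word. */
struct demo_key {
	const char *payload;	/* plain data, published by the writer */
	_Atomic short state;	/* 0 = uninstantiated, 1 = positive */
};

/* Writer: store the payload first, then publish the state with release. */
static void demo_instantiate(struct demo_key *key, const char *payload)
{
	key->payload = payload;
	atomic_store_explicit(&key->state, 1, memory_order_release);
}

/*
 * Reader: load the state with acquire before touching the payload, so a
 * reader that observes state == 1 also observes the payload store.
 * Returns NULL while the key is still uninstantiated.
 */
static const char *demo_read_payload(struct demo_key *key)
{
	if (atomic_load_explicit(&key->state, memory_order_acquire) != 1)
		return NULL;
	return key->payload;
}
```

A relaxed load in the reader (the analogue of READ_ONCE() here) would not order the later payload read after the state read, which is the gap the one-line change above closes.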



* [BUG] apparmor: AA_BUG aa_policy_destroy on aa_alloc_profile error path
From: Farhad Alemi @ 2026-05-29  3:32 UTC (permalink / raw)
  To: John Johansen
  Cc: falemi, Tiffany Bao, Adam Doupé, Fish Wang,
	Yan Shoshitaishvili, Paul Moore, James Morris, Serge E. Hallyn,
	apparmor, linux-security-module, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3589 bytes --]

Hello John and the AppArmor team,

I am reporting an AppArmor AA_BUG WARN in aa_policy_destroy() found
by syzkaller as part of research at the SEFCOM Lab at ASU.

Summary:
A write(2) to /proc/<pid>/attr/<lsm>/current that drives the
aa_change_hat() -> aa_new_learning_profile() -> aa_alloc_null() ->
aa_alloc_profile() chain takes the error-rollback path at
security/apparmor/policy.c:409 (aa_alloc_profile()'s `fail:` label
calling aa_free_profile(profile)). aa_free_profile() then calls
aa_policy_destroy(&profile->base) at security/apparmor/policy.c:327,
which trips its first AA_BUG at security/apparmor/lib.c:509:

  void aa_policy_destroy(struct aa_policy *policy)
  {
          AA_BUG(on_list_rcu(&policy->profiles));   <-- :509
          AA_BUG(on_list_rcu(&policy->list));
          ...
  }

  /* security/apparmor/include/policy.h:60 */
  #define on_list_rcu(X) (!list_empty(X) && (X)->prev != LIST_POISON2)

The WARN reproduces the macro's condition verbatim (the kernel prints
the full stringified expression including the LIST_POISON2 numeric
0x122 + 0xdead000000000000UL); see crash-report.txt for the full
header.
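
For reference, the condition can be exercised in user space. This is a minimal sketch assuming a 64-bit build: `struct list_head`, `init_list_head()` and `list_empty()` are simplified stand-ins for the kernel helpers, and `on_list_rcu()` is written as a function with the same condition as the macro quoted above.

```c
#include <stdbool.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Value printed in the WARN: 0x122 + 0xdead000000000000UL (64-bit only). */
#define LIST_POISON2 ((struct list_head *)(0x122 + 0xdead000000000000UL))

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Same condition as the on_list_rcu() macro quoted above. */
static bool on_list_rcu(const struct list_head *h)
{
	return !list_empty(h) && h->prev != LIST_POISON2;
}
```

In this model the check passes for an initialised empty head and for a head whose prev has been poisoned, and it fires for a head that still links to another node. A head that was never initialised also satisfies the condition; the report does not establish which of these states the failing profile was in.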

Observed on:
- Linux v7.1-rc3-200-g70eda68668d1-dirty (the only local dirty file
  is drivers/tty/serial/serial_core.c, a console guard our fuzzing
  harness uses, unrelated to security/apparmor/), x86_64, QEMU Q35
- AA_BUG asserts enabled + panic_on_warn (the crash tail prints
  "Kernel panic - not syncing: kernel: panic_on_warn set")
- Source inspection of linus/master at commit e8c2f9fdadee
  (v7.1-rc4-754-ge8c2f9fdadee) shows the buggy structure is
  unchanged: security/apparmor/lib.c:509 still does
  AA_BUG(on_list_rcu(&policy->profiles)); aa_alloc_profile()'s fail
  path at security/apparmor/policy.c:409 still calls
  aa_free_profile(profile); aa_free_profile() at policy.c:327 still
  calls aa_policy_destroy(&profile->base). As no reproducer is available
  for this seed, I have not re-triggered the crash against e8c2f9fdadee.

Expected behavior:
Either aa_alloc_profile()'s rollback path must guarantee
profile->base.profiles is empty (or list_del'd so prev == LIST_POISON2)
before calling aa_free_profile(), or aa_policy_destroy()'s AA_BUG
should be softened to a WARN_ON-and-drain so it does not panic on an
alloc-rollback path. The maintainers are best placed to choose which
side of the contract owns this.

Reproducer:
A standalone .syz or C reproducer was not produced for this seed;
the crash fired during automated /proc/<pid>/attr/* fuzzing. The
console report is attached as crash-report.txt.

Novelty check:
I searched the syzbot dashboard's upstream open, fixed, stable, and
invalid (per-subsystem apparmor) namespaces; the Android dashboard;
the marc.info linux-security-module archive; and the complete
apparmor@lists.ubuntu.com list archive (2010 through 2026, full
message bodies), for "aa_policy_destroy", "on_list_rcu(&policy->
profiles)", "aa_alloc_profile" + "WARNING", and "AA_BUG" +
"policy->profiles". I did not find a prior report of this crash. The
three apparmor-titled entries in the syzbot invalid namespace are in
different functions (apparmor_sk_free_security UAF, aa_label_sk_perm
UAF, apparmor_file_open data-race). The only aa_policy_destroy
mentions on the AppArmor list are a 2022 "Fix memleak in alloc_ns()"
patch (a different aa_policy_destroy(&ns->base) call site), and there
is no occurrence of on_list_rcu(&policy->profiles) anywhere in the
list history.

I appreciate your time and consideration, and I'm grateful for your
work on this subsystem. I'd be glad to test any candidate patches.

Regards,

[-- Attachment #2: crash-report.txt --]
[-- Type: text/plain, Size: 8182 bytes --]

 </TASK>
------------[ cut here ]------------
AppArmor WARN aa_policy_destroy: (((!list_empty(&policy->profiles) && (&policy->profiles)->prev != ((void *) 0x122 + (0xdead000000000000UL))))): 
WARNING: security/apparmor/lib.c:509 at aa_policy_destroy+0x169/0x1c0 security/apparmor/lib.c:509, CPU#0: syz.3.739/13898
Modules linked in:
CPU: 0 UID: 0 PID: 13898 Comm: syz.3.739 Not tainted 7.1.0-rc3-00200-g70eda68668d1-dirty #1 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:aa_policy_destroy+0x170/0x1c0 security/apparmor/lib.c:509
Code: 85 ed 7e 4d e8 c1 9a dc fd 5b 41 5c 41 5e 41 5f 5d c3 cc cc cc cc cc e8 ae 9a dc fd 48 8d 3d 87 1c 0b 05 48 c7 c6 b8 a7 82 87 <67> 48 0f b9 3a e9 04 ff ff ff e8 91 9a dc fd 48 8d 3d 7a 1c 0b 05
RSP: 0018:ffffc9000141f500 EFLAGS: 00010293
RAX: ffffffff83a572b2 RBX: ffff88811907a400 RCX: ffff88812f778000
RDX: 0000000000000000 RSI: ffffffff8782a7b8 RDI: ffffffff88b08f40
RBP: 0000000000000cc0 R08: 0000000000000cc0 R09: 00000000ffffffff
R10: dffffc0000000000 R11: fffffbfff100a27f R12: dead000000000122
R13: ffff88811907a400 R14: ffff88811907a428 R15: dffffc0000000000
FS:  00007f51fd2d76c0(0000) GS:ffff8882ab6b6000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f51fe8cfe10 CR3: 000000011b6ea000 CR4: 0000000000750ef0
PKRU: 80000000
Call Trace:
 <TASK>
 aa_free_profile+0xa2/0x9f0 security/apparmor/policy.c:327
 aa_alloc_profile+0x1f1/0x3f0 security/apparmor/policy.c:409
 aa_alloc_null+0x2d/0x530 security/apparmor/policy.c:690
 aa_new_learning_profile+0x226/0x4e0 security/apparmor/policy.c:767
 build_change_hat+0x292/0x400 security/apparmor/domain.c:1079
 change_hat security/apparmor/domain.c:1193 [inline]
 aa_change_hat+0x1177/0x2fb0 security/apparmor/domain.c:1269
 aa_setprocattr_changehat+0x4a6/0x5b0 security/apparmor/procattr.c:138
 do_setattr+0x548/0x6a0
 proc_pid_attr_write+0x5d1/0x630 fs/proc/base.c:2844
 vfs_write+0x29f/0xb90 fs/read_write.c:686
 ksys_write+0x155/0x270 fs/read_write.c:740
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x15f/0x560 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f51fe88778d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f51fd2d7018 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f51feb15fa0 RCX: 00007f51fe88778d
RDX: 0000000000000022 RSI: 00002000000000c0 RDI: 0000000000000003
RBP: 00007f51fd2d7080 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007f51feb16038 R14: 00007f51feb15fa0 R15: 00007ffc4916b870
 </TASK>
----------------
Code disassembly (best guess):
   0:	85 ed                	test   %ebp,%ebp
   2:	7e 4d                	jle    0x51
   4:	e8 c1 9a dc fd       	call   0xfddc9aca
   9:	5b                   	pop    %rbx
   a:	41 5c                	pop    %r12
   c:	41 5e                	pop    %r14
   e:	41 5f                	pop    %r15
  10:	5d                   	pop    %rbp
  11:	c3                   	ret
  12:	cc                   	int3
  13:	cc                   	int3
  14:	cc                   	int3
  15:	cc                   	int3
  16:	cc                   	int3
  17:	e8 ae 9a dc fd       	call   0xfddc9aca
  1c:	48 8d 3d 87 1c 0b 05 	lea    0x50b1c87(%rip),%rdi        # 0x50b1caa
  23:	48 c7 c6 b8 a7 82 87 	mov    $0xffffffff8782a7b8,%rsi
* 2a:	67 48 0f b9 3a       	ud1    (%edx),%rdi <-- trapping instruction
  2f:	e9 04 ff ff ff       	jmp    0xffffff38
  34:	e8 91 9a dc fd       	call   0xfddc9aca
  39:	48 8d 3d 7a 1c 0b 05 	lea    0x50b1c7a(%rip),%rdi        # 0x50b1cba

<<<<<<<<<<<<<<< tail report >>>>>>>>>>>>>>>

Modules linked in:
CPU: 0 UID: 0 PID: 13898 Comm: syz.3.739 Not tainted 7.1.0-rc3-00200-g70eda68668d1-dirty #1 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:aa_policy_destroy+0x170/0x1c0
Code: 85 ed 7e 4d e8 c1 9a dc fd 5b 41 5c 41 5e 41 5f 5d c3 cc cc cc cc cc e8 ae 9a dc fd 48 8d 3d 87 1c 0b 05 48 c7 c6 b8 a7 82 87 <67> 48 0f b9 3a e9 04 ff ff ff e8 91 9a dc fd 48 8d 3d 7a 1c 0b 05
RSP: 0018:ffffc9000141f500 EFLAGS: 00010293
RAX: ffffffff83a572b2 RBX: ffff88811907a400 RCX: ffff88812f778000
RDX: 0000000000000000 RSI: ffffffff8782a7b8 RDI: ffffffff88b08f40
RBP: 0000000000000cc0 R08: 0000000000000cc0 R09: 00000000ffffffff
R10: dffffc0000000000 R11: fffffbfff100a27f R12: dead000000000122
R13: ffff88811907a400 R14: ffff88811907a428 R15: dffffc0000000000
FS:  00007f51fd2d76c0(0000) GS:ffff8882ab6b6000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f51fe8cfe10 CR3: 000000011b6ea000 CR4: 0000000000750ef0
PKRU: 80000000
Call Trace:
 <TASK>
 aa_free_profile+0xa2/0x9f0
 aa_alloc_profile+0x1f1/0x3f0
 aa_alloc_null+0x2d/0x530
 aa_new_learning_profile+0x226/0x4e0
 build_change_hat+0x292/0x400
 aa_change_hat+0x1177/0x2fb0
 aa_setprocattr_changehat+0x4a6/0x5b0
 do_setattr+0x548/0x6a0
 proc_pid_attr_write+0x5d1/0x630
 vfs_write+0x29f/0xb90
 ksys_write+0x155/0x270
 do_syscall_64+0x15f/0x560
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f51fe88778d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f51fd2d7018 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f51feb15fa0 RCX: 00007f51fe88778d
RDX: 0000000000000022 RSI: 00002000000000c0 RDI: 0000000000000003
RBP: 00007f51fd2d7080 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007f51feb16038 R14: 00007f51feb15fa0 R15: 00007ffc4916b870
 </TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 0 UID: 0 PID: 13898 Comm: syz.3.739 Not tainted 7.1.0-rc3-00200-g70eda68668d1-dirty #1 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
 <TASK>
 vpanic+0x571/0xa60
 panic+0xca/0xd0
 __warn+0x31a/0x4d0
 __report_bug+0x29a/0x540
 report_bug_entry+0x19a/0x290
 handle_bug+0xce/0x200
 exc_invalid_op+0x1a/0x50
 asm_exc_invalid_op+0x1a/0x20
RIP: 0010:aa_policy_destroy+0x170/0x1c0
Code: 85 ed 7e 4d e8 c1 9a dc fd 5b 41 5c 41 5e 41 5f 5d c3 cc cc cc cc cc e8 ae 9a dc fd 48 8d 3d 87 1c 0b 05 48 c7 c6 b8 a7 82 87 <67> 48 0f b9 3a e9 04 ff ff ff e8 91 9a dc fd 48 8d 3d 7a 1c 0b 05
RSP: 0018:ffffc9000141f500 EFLAGS: 00010293
RAX: ffffffff83a572b2 RBX: ffff88811907a400 RCX: ffff88812f778000
RDX: 0000000000000000 RSI: ffffffff8782a7b8 RDI: ffffffff88b08f40
RBP: 0000000000000cc0 R08: 0000000000000cc0 R09: 00000000ffffffff
R10: dffffc0000000000 R11: fffffbfff100a27f R12: dead000000000122
R13: ffff88811907a400 R14: ffff88811907a428 R15: dffffc0000000000
 aa_free_profile+0xa2/0x9f0
 aa_alloc_profile+0x1f1/0x3f0
 aa_alloc_null+0x2d/0x530
 aa_new_learning_profile+0x226/0x4e0
 build_change_hat+0x292/0x400
 aa_change_hat+0x1177/0x2fb0
 aa_setprocattr_changehat+0x4a6/0x5b0
 do_setattr+0x548/0x6a0
 proc_pid_attr_write+0x5d1/0x630
 vfs_write+0x29f/0xb90
 ksys_write+0x155/0x270
 do_syscall_64+0x15f/0x560
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f51fe88778d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f51fd2d7018 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f51feb15fa0 RCX: 00007f51fe88778d
RDX: 0000000000000022 RSI: 00002000000000c0 RDI: 0000000000000003
RBP: 00007f51fd2d7080 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007f51feb16038 R14: 00007f51feb15fa0 R15: 00007ffc4916b870
 </TASK>
Kernel Offset: disabled
Rebooting in 86400 seconds..

<<<<<<<<<<<<<<< tail report >>>>>>>>>>>>>>>



* Re: [PATCH] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via F_SETOWN to invoker's pgid
From: Justin Suess @ 2026-05-29  3:25 UTC (permalink / raw)
  To: hexlabsecurity
  Cc: mic@digikod.net, gnoack@google.com,
	linux-security-module@vger.kernel.org, stable@vger.kernel.org
In-Reply-To: <cFjmBkbTY-D5pYl66NixBeqbhWBzS7kBEUHCWbhTQwkiuvKg8xNkSEf9rYqDQiD76er1gK8Q6t1YOJ4nIPuvILuwG42d8_rfMZpQ5VmJru0=@proton.me>

On Thu, May 28, 2026 at 09:21:50PM +0000, hexlabsecurity@proton.me wrote:
> From 22a0086b44beaaef01883e047dd4a8b8bc3153e9 Mon Sep 17 00:00:00 2001
> From: Bryam Vargas <hexlabsecurity@proton.me>
> Date: Thu, 28 May 2026 01:30:00 -0500
> Subject: [PATCH] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via F_SETOWN to
>  invoker's pgid
> 
> A Landlock-restricted process can bypass LANDLOCK_SCOPE_SIGNAL on the
> SIGIO delivery path and deliver arbitrary signals (including SIGKILL via
> F_SETSIG) to non-Landlocked targets that share its pgid, by exploiting a
> producer-side cache-vs-live evaluation gap.
> 
> The SIGIO path in hook_file_send_sigiotask() consults a cached subject
> stored in landlock_file(file)->fown_subject at fcntl(F_SETOWN) time
> (via hook_file_set_fowner()), instead of evaluating the live Landlock
> domain of the invoking task at signal-send time. The capture is gated
> by control_current_fowner(), which returns false (skipping capture)
> when pid_task(fown->pid, fown->pid_type) is in current's thread group.
> 
> This is correct for PIDTYPE_TGID / PIDTYPE_PID, where the target is a
> single thread or thread-group leader sharing current's cred. It is
> unsafe for PIDTYPE_PGID and PIDTYPE_SID: when current is at the head
> of its pgid hlist -- the default placement after fork(),
> hlist_add_head_rcu() in kernel/fork.c -- pid_task(pgid, PIDTYPE_PGID)
> resolves to current itself, same_thread_group(current, current) is
> true, the capture is skipped, and fown_subject.domain stays NULL.
> 
> hook_file_send_sigiotask() then short-circuits at
> "if (!subject->domain) return 0;", allowing the kernel to fan the
> signal out to every member of the group, including tasks outside
> current's Landlock domain that the SCOPE_SIGNAL contract is supposed
> to protect.
> 
> The direct kill() path (hook_task_kill) is unaffected: it evaluates
> current's live domain on every call. Only the cached SIGIO path is
> broken.
> 
> Repro (ordinary unprivileged user; sandbox active in the child):
> 
>   int pfd[2]; pipe(pfd);
>   landlock_create_ruleset(&{.scoped = LANDLOCK_SCOPE_SIGNAL},
>                           sizeof(attr), 0);
>   prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
>   landlock_restrict_self(rfd, 0);
>   fcntl(pfd[0], F_SETSIG, SIGKILL);
>   fcntl(pfd[0], F_SETOWN, -getpgrp());           /* PIDTYPE_PGID */
>   fcntl(pfd[0], F_SETFL, O_ASYNC);
>   write(pfd[1], "X", 1);                         /* trigger SIGIO  */
>   /* every pgid member receives SIGKILL, including non-sandboxed
>    * parent / supervisor / sibling workers */
>
I was able to reproduce this on mic/next.

Great catch!

> Tighten control_current_fowner() to apply the thread-group exemption
> only when the target identifies a SINGLE task whose Landlock cred is
> necessarily shared with current (PIDTYPE_TGID, PIDTYPE_PID). For
> PIDTYPE_PGID and PIDTYPE_SID, always capture the current Landlock
> subject so the consumer's scope check runs against every member of
> the group at delivery time.
> 
> Empirically A/B-verified on a 6.12.90 lab kernel (same .config, only
> the patch hunk differs): pre-fix build exits with "BUG PRESENT --
> SCOPE_SIGNAL BYPASSED", post-fix build exits with "SANDBOX HELD".
> hook_task_kill's direct-kill enforcement and the intra-thread-group
> F_SETOWN cases continue to work post-patch.
> 
> Reported-by: Bryam Vargas <hexlabsecurity@proton.me>
> Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
> ---
>  security/landlock/fs.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index c1ecfe239032..edaa52572cbd 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -1909,6 +1909,18 @@ static bool control_current_fowner(struct fown_struct *const fown)
>  	if (!p)
>  		return true;
> 
> +	/*
> +	 * For PIDTYPE_PGID and PIDTYPE_SID, signal delivery fans out to
> +	 * every member of the group at SIGIO time. Even when pid_task()
> +	 * resolves to current itself (e.g., current is the pgid hlist
> +	 * head post-fork), non-current members of the group are still
> +	 * valid targets that must be checked by hook_file_send_sigiotask().
> +	 * Always capture the current subject for those types so the
> +	 * consumer scope check runs against the live fown_subject.
> +	 */
> +	if (fown->pid_type == PIDTYPE_PGID || fown->pid_type == PIDTYPE_SID)
> +		return true;
This seems right.

So basically we are failing to check the subject on fan-out
signals where type > PIDTYPE_TGID (i.e., PIDTYPE_PGID/SID).

But this fix seems good as is to me and closed the reproducer hole in my
test. Unless there are some edge cases I'm missing.

The commit message could use some cleanup and shortening. No need to
include the reproducer (though it was helpful) or the "BUG PRESENT" /
"SANDBOX HELD" A/B testing output. Just explain the bug and what
it fixes :)

You can add the reproducer and stuff below the --- in the patch and
above the diffstat in the future to make it part of the git notes and
not the actual commit.

That way you can add anything else that doesn't belong in the actual
commit but is important for context.

This may need an erratum entry and a regression test in the future,
but that can be done separately.

Again great job!

Tested-by: Justin Suess <utilityemal77@gmail.com>
> +
>  	return !same_thread_group(p, current);
>  }
> 
> --
> 2.43.0
> 


* Re: [PATCH v9 4/9] samples/landlock: Add quiet flag support to sandboxer
From: Justin Suess @ 2026-05-29  2:34 UTC (permalink / raw)
  To: Tingmao Wang
  Cc: Mickaël Salaün, Günther Noack, Jan Kara,
	Abhinav Saxena, linux-security-module
In-Reply-To: <7d5ad9631a51df6c2b857ff9c0122ff8ed491b7d.1779843375.git.m@maowtm.org>

On Wed, May 27, 2026 at 02:01:14AM +0100, Tingmao Wang wrote:
> Adds ability to set which access bits to quiet via LL_*_QUIET_ACCESS (FS,
> NET or SCOPED), and attach quiet flags to individual objects via
> LL_*_QUIET for FS and NET.
> 
> Signed-off-by: Tingmao Wang <m@maowtm.org>
> ---
> 
> Changes in v9:
> - Add udp connect / bind quiet flag support
> 
> Changes in v8:
> - Rebase on top of mic/next
> - populate_ruleset_net() already does not require the env var to be
>   present, so remove redundant comment and check above
>   populate_ruleset_net(ENV_NET_QUIET_NAME, ...).
> 
> Changes in v6:
> - Make populate_ruleset_{fs,net} take a flags argument instead of a bool
>   quiet (suggested by Justin Suess)
> - Fix if braces style
> 
> Changes in v3:
> - Minor change to the above commit message.
> 
> Changes in v2:
> - Added new environment variables to control which quiet access bits to
>   set on the rule, and populate quiet_access_* from it.
> - Added support for quieting net rules and scoped access.  Renamed patch
>   title.
> - Increment ABI version
> 
>  samples/landlock/sandboxer.c | 134 ++++++++++++++++++++++++++++++++---
>  1 file changed, 123 insertions(+), 11 deletions(-)
> 
> diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
> index 94e399e6b146..74ee53afed6a 100644
> --- a/samples/landlock/sandboxer.c
> +++ b/samples/landlock/sandboxer.c
> @@ -58,9 +58,14 @@ static inline int landlock_restrict_self(const int ruleset_fd,
>  
>  #define ENV_FS_RO_NAME "LL_FS_RO"
>  #define ENV_FS_RW_NAME "LL_FS_RW"
> +#define ENV_FS_QUIET_NAME "LL_FS_QUIET"
> +#define ENV_FS_QUIET_ACCESS_NAME "LL_FS_QUIET_ACCESS"
>  #define ENV_TCP_BIND_NAME "LL_TCP_BIND"
>  #define ENV_TCP_CONNECT_NAME "LL_TCP_CONNECT"
> +#define ENV_NET_QUIET_NAME "LL_NET_QUIET"
> +#define ENV_NET_QUIET_ACCESS_NAME "LL_NET_QUIET_ACCESS"
>  #define ENV_SCOPED_NAME "LL_SCOPED"
> +#define ENV_SCOPED_QUIET_ACCESS_NAME "LL_SCOPED_QUIET_ACCESS"
>  #define ENV_FORCE_LOG_NAME "LL_FORCE_LOG"
>  #define ENV_UDP_BIND_NAME "LL_UDP_BIND"
>  #define ENV_UDP_CONNECT_SEND_NAME "LL_UDP_CONNECT_SEND"
> @@ -119,7 +124,7 @@ static int parse_path(char *env_path, const char ***const path_list)
>  /* clang-format on */
>  
>  static int populate_ruleset_fs(const char *const env_var, const int ruleset_fd,
> -			       const __u64 allowed_access)
> +			       const __u64 allowed_access, __u32 flags)
>  {
>  	int num_paths, i, ret = 1;
>  	char *env_path_name;
> @@ -169,7 +174,7 @@ static int populate_ruleset_fs(const char *const env_var, const int ruleset_fd,
>  		if (!S_ISDIR(statbuf.st_mode))
>  			path_beneath.allowed_access &= ACCESS_FILE;
>  		if (landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
> -				      &path_beneath, 0)) {
> +				      &path_beneath, flags)) {
>  			fprintf(stderr,
>  				"Failed to update the ruleset with \"%s\": %s\n",
>  				path_list[i], strerror(errno));
> @@ -187,7 +192,7 @@ static int populate_ruleset_fs(const char *const env_var, const int ruleset_fd,
>  }
>  
>  static int populate_ruleset_net(const char *const env_var, const int ruleset_fd,
> -				const __u64 allowed_access)
> +				const __u64 allowed_access, __u32 flags)
>  {
>  	int ret = 1;
>  	char *env_port_name, *env_port_name_next, *strport;
> @@ -215,7 +220,7 @@ static int populate_ruleset_net(const char *const env_var, const int ruleset_fd,
>  		}
>  		net_port.port = port;
>  		if (landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
> -				      &net_port, 0)) {
> +				      &net_port, flags)) {
>  			fprintf(stderr,
>  				"Failed to update the ruleset with port \"%llu\": %s\n",
>  				net_port.port, strerror(errno));
> @@ -303,6 +308,58 @@ static bool check_ruleset_scope(const char *const env_var,
>  
>  /* clang-format on */
>  
> +static int add_quiet_access(__u64 *const quiet_access,
> +			    const __u64 handled_access,
> +			    const char *const env_var, const bool default_all)
> +{
> +	char *env_quiet_access, *env_quiet_access_next, *str_access;
> +
> +	if (default_all)
> +		*quiet_access = handled_access;
> +	else
> +		*quiet_access = 0;
> +
> +	env_quiet_access = getenv(env_var);
> +	if (!env_quiet_access)
> +		return 0;
> +
> +	env_quiet_access = strdup(env_quiet_access);
> +	env_quiet_access_next = env_quiet_access;
> +	unsetenv(env_var);
> +	*quiet_access = 0;
> +
> +	while ((str_access = strsep(&env_quiet_access_next, ENV_DELIMITER))) {
> +		if (strcmp(str_access, "") == 0)
> +			continue;
> +		else if (strcmp(str_access, "r") == 0)
> +			*quiet_access |= ACCESS_FS_ROUGHLY_READ;
> +		else if (strcmp(str_access, "w") == 0)
> +			*quiet_access |= ACCESS_FS_ROUGHLY_WRITE;
> +		else if (strcmp(str_access, "b") == 0)
> +			*quiet_access |= LANDLOCK_ACCESS_NET_BIND_TCP;
> +		else if (strcmp(str_access, "c") == 0)
> +			*quiet_access |= LANDLOCK_ACCESS_NET_CONNECT_TCP;
> +		else if (strcmp(str_access, "ub") == 0)
> +			*quiet_access |= LANDLOCK_ACCESS_NET_BIND_UDP;
> +		else if (strcmp(str_access, "uc") == 0)
> +			*quiet_access |= LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP;
> +		else if (strcmp(str_access, "a") == 0)
> +			*quiet_access |= LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET;
> +		else if (strcmp(str_access, "s") == 0)
> +			*quiet_access |= LANDLOCK_SCOPE_SIGNAL;
You don't need to do it in this patch but these strings should probably
be centrally defined somewhere... as we add more they could be easy to
mix up.
> +		else {
> +			fprintf(stderr, "Unknown quiet access \"%s\"\n",
> +				str_access);
> +			free(env_quiet_access);
> +			return -1;
> +		}
> +	}
> +
> +	free(env_quiet_access);
> +	*quiet_access &= handled_access;
> +	return 0;
> +}
> +
>  #define LANDLOCK_ABI_LAST 10
>  
>  #define XSTR(s) #s
> @@ -336,6 +393,22 @@ static const char help[] =
>  	"\n"
>  	"A sandboxer should not log denied access requests to avoid spamming logs, "
>  	"but to test audit we can set " ENV_FORCE_LOG_NAME "=1\n"
> +	ENV_FS_QUIET_NAME " and " ENV_NET_QUIET_NAME ", both optional, can then be used "
> +	"to make access to some denied paths or network ports not trigger audit logging.\n"
> +	ENV_FS_QUIET_ACCESS_NAME " and " ENV_NET_QUIET_ACCESS_NAME " can be used to specify "
> +	"which accesses should be quieted (defaults to all):\n"
> +	"* " ENV_FS_QUIET_ACCESS_NAME ": file system accesses to quiet\n"
> +	"  - \"r\" to quiet all file/dir read accesses\n"
> +	"  - \"w\" to quiet all file/dir write accesses\n"
> +	"* " ENV_NET_QUIET_ACCESS_NAME ": network accesses to quiet\n"
> +	"  - \"b\" to quiet tcp bind denials\n"
> +	"  - \"c\" to quiet tcp connect denials\n"
> +	"  - \"ub\" to quiet udp bind denials\n"
> +	"  - \"uc\" to quiet udp connect / send denials\n"
> +	"In addition, " ENV_SCOPED_QUIET_ACCESS_NAME " can be set to quiet all denials for "
> +	"scoped actions (defaults to none).\n"
> +	"  - \"a\" to quiet abstract unix socket denials\n"
> +	"  - \"s\" to quiet signal denials\n"
>  	"\n"
>  	"Example:\n"
>  	ENV_FS_RO_NAME "=\"${PATH}:/lib:/usr:/proc:/etc:/dev/urandom\" "
> @@ -368,7 +441,12 @@ int main(const int argc, char *const argv[], char *const *const envp)
>  				      LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP,
>  		.scoped = LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
>  			  LANDLOCK_SCOPE_SIGNAL,
> +		.quiet_access_fs = 0,
> +		.quiet_access_net = 0,
> +		.quiet_scoped = 0,
>  	};
> +
> +	bool quiet_supported = true;
>  	int supported_restrict_flags = LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON;
>  	int set_restrict_flags = 0;
>  
> @@ -459,6 +537,9 @@ int main(const int argc, char *const argv[], char *const *const envp)
>  		ruleset_attr.handled_access_net &=
>  			~(LANDLOCK_ACCESS_NET_BIND_UDP |
>  			  LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP);
> +		__attribute__((fallthrough));
The fallthrough should be the last statement in the switch case;
otherwise this causes a build warning.
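
To illustrate the ordering I mean, here is a minimal standalone sketch. It
only models ABIs 8 to 10, and struct compat and compat_for_abi() are
made-up names, not sandboxer code. Each case does its work first and the
fallthrough attribute comes immediately before the next case label.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result of the compatibility switch, for illustration. */
struct compat {
	bool rights_trimmed;
	bool quiet_supported;
};

static struct compat compat_for_abi(int abi)
{
	struct compat c = {
		.rights_trimmed = false,
		.quiet_supported = true,
	};

	/* This model only covers three ABI versions. */
	assert(abi >= 8 && abi <= 10);

	switch (abi) {
	case 8:
		/* Stand-in for removing rights unknown to this ABI. */
		c.rights_trimmed = true;
		/* Last statement of the case, right before the next label. */
		__attribute__((fallthrough));
	case 9:
		/* Don't add quiet flags for ABI < 10 later on. */
		c.quiet_supported = false;
		__attribute__((fallthrough));
	case 10:
		break;
	}
	return c;
}
```

With the attribute placed before further statements, as in the hunk above,
GCC reports that the attribute does not precede a case or default label.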
> +		/* Don't add quiet flags for ABI < 10 later on. */
> +		quiet_supported = false;
>  
>  		/* Must be printed for any ABI < LANDLOCK_ABI_LAST. */
>  		fprintf(stderr,
> @@ -525,6 +606,25 @@ int main(const int argc, char *const argv[], char *const *const envp)
>  		unsetenv(ENV_FORCE_LOG_NAME);
>  	}
>  
> +	/*
> +	 * Add quiet for fs/net handled access bits.  Doing this alone has no
> +	 * effect unless we later add quiet rules per FS_QUIET/NET_QUIET.
> +	 */
> +	if (quiet_supported) {
> +		if (add_quiet_access(&ruleset_attr.quiet_access_fs,
> +				     ruleset_attr.handled_access_fs,
> +				     ENV_FS_QUIET_ACCESS_NAME, true))
> +			return 1;
> +		if (add_quiet_access(&ruleset_attr.quiet_access_net,
> +				     ruleset_attr.handled_access_net,
> +				     ENV_NET_QUIET_ACCESS_NAME, true))
> +			return 1;
> +		if (add_quiet_access(&ruleset_attr.quiet_scoped,
> +				     ruleset_attr.scoped,
> +				     ENV_SCOPED_QUIET_ACCESS_NAME, false))
> +			return 1;
> +	}
> +
>  	ruleset_fd =
>  		landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
>  	if (ruleset_fd < 0) {
> @@ -532,30 +632,42 @@ int main(const int argc, char *const argv[], char *const *const envp)
>  		return 1;
>  	}
>  
> -	if (populate_ruleset_fs(ENV_FS_RO_NAME, ruleset_fd, access_fs_ro)) {
> +	if (populate_ruleset_fs(ENV_FS_RO_NAME, ruleset_fd, access_fs_ro, 0))
>  		goto err_close_ruleset;
> -	}
> -	if (populate_ruleset_fs(ENV_FS_RW_NAME, ruleset_fd, access_fs_rw)) {
> +	if (populate_ruleset_fs(ENV_FS_RW_NAME, ruleset_fd, access_fs_rw, 0))
>  		goto err_close_ruleset;
> +
> +	/* Don't require this env to be present. */
> +	if (quiet_supported && getenv(ENV_FS_QUIET_NAME)) {
> +		if (populate_ruleset_fs(ENV_FS_QUIET_NAME, ruleset_fd, 0,
> +					LANDLOCK_ADD_RULE_QUIET))
> +			goto err_close_ruleset;
>  	}
>  
>  	if (populate_ruleset_net(ENV_TCP_BIND_NAME, ruleset_fd,
> -				 LANDLOCK_ACCESS_NET_BIND_TCP)) {
> +				 LANDLOCK_ACCESS_NET_BIND_TCP, 0)) {
>  		goto err_close_ruleset;
>  	}
>  	if (populate_ruleset_net(ENV_TCP_CONNECT_NAME, ruleset_fd,
> -				 LANDLOCK_ACCESS_NET_CONNECT_TCP)) {
> +				 LANDLOCK_ACCESS_NET_CONNECT_TCP, 0)) {
>  		goto err_close_ruleset;
>  	}
>  	if (populate_ruleset_net(ENV_UDP_BIND_NAME, ruleset_fd,
> -				 LANDLOCK_ACCESS_NET_BIND_UDP)) {
> +				 LANDLOCK_ACCESS_NET_BIND_UDP, 0)) {
>  		goto err_close_ruleset;
>  	}
>  	if (populate_ruleset_net(ENV_UDP_CONNECT_SEND_NAME, ruleset_fd,
> -				 LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP)) {
> +				 LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP, 0)) {
>  		goto err_close_ruleset;
>  	}
>  
> +	if (quiet_supported) {
> +		if (populate_ruleset_net(ENV_NET_QUIET_NAME, ruleset_fd, 0,
> +					 LANDLOCK_ADD_RULE_QUIET)) {
> +			goto err_close_ruleset;
> +		}
> +	}
> +
>  	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
>  		perror("Failed to restrict privileges");
>  		goto err_close_ruleset;
> -- 
> 2.54.0

* Re: [PATCH v8 03/10] landlock: Use landlock_walk_path_up() in collect_domain_accesses()
From: Justin Suess @ 2026-05-29  2:24 UTC (permalink / raw)
  To: gnoack3000, mic; +Cc: linux-kernel, linux-security-module
In-Reply-To: <20260529015210.500291-4-utilityemal77@gmail.com>

On Thu, May 28, 2026 at 09:52:02PM -0400, Justin Suess wrote:
> Replace the open-coded loop with landlock_walk_path_up() and change the
> function signature from (mnt_root, dir) to a single struct path.  The
> caller's mount point and starting dentry are now both carried in @path,
> which keeps the traversal logic consistent with
> is_access_to_paths_allowed().
> 
> No functional change intended.
> 
> Signed-off-by: Justin Suess <utilityemal77@gmail.com>
> ---
> 
> Notes:
>     v7..v8 changes:
>     
>       * Reworded commit message.
>       * Changed collect_domain_accesses() to take a single struct path *
>         instead of separate mnt_root/dir parameters, simplifying the
>         interface and matching is_access_to_paths_allowed().
>       * Tightened the disconnected-directory stop condition to require
>         !d_unhashed(walker_path.dentry) when comparing against the mount
Disregard this note about d_unhashed. It is an unnecessary leftover from
some attempted bugfixes in my draft (now addressed), and I forgot to
remove this bullet from the git notes before sending.

Justin
> [...]

* [PATCH v8 10/10] landlock: Add KUnit tests for LANDLOCK_ADD_RULE_NO_INHERIT
From: Justin Suess @ 2026-05-29  1:52 UTC (permalink / raw)
  To: gnoack3000, mic; +Cc: linux-kernel, linux-security-module, Justin Suess
In-Reply-To: <20260529015210.500291-1-utilityemal77@gmail.com>

Add the landlock_ruleset KUnit suite with five tests for the
no_inherit handling in landlock_unmask_layers():

- test_unmask_no_inherit_propagates: a rule with no_inherit unmasks
  access and sets the no_inherit bit on the layer mask.
- test_unmask_no_inherit_skip: a layer with no_inherit already set in
  the mask is skipped (no access removal).
- test_unmask_no_inherit_both_set: when both rule and mask have
  no_inherit, the skip still happens and the bit stays set.
- test_unmask_multilayer_no_inherit: no_inherit on one layer of a
  multi-layer rule only affects that layer.
- test_unmask_no_inherit_sequential: applying a descendant rule
  (no_inherit) followed by an ancestor rule causes the ancestor to be
  skipped, modeling a path walk.

Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---

Notes:
    v7..v8 changes:
    
      * Renamed patch from 'Implement KUnit test' to 'Add KUnit tests'
        (now plural).
      * Replaced the single test_unmask_layers_no_inherit() case with
        five focused tests aligned with the new per-layer no_inherit
        bit added to struct layer_mask in patch 6:
    
        - test_unmask_no_inherit_propagates
        - test_unmask_no_inherit_skip
        - test_unmask_no_inherit_both_set
        - test_unmask_multilayer_no_inherit
        - test_unmask_no_inherit_sequential
    
      * Added alloc_rule() and fill_masks() helpers to share setup
        across the new tests.
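
For readers without the series applied, the behavior these five tests
expect can be summarized with a small userspace model. This is a sketch
inferred from the test expectations only, not the kernel implementation of
landlock_unmask_layers(); all model_* names and MODEL_MAX_LAYERS are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_LAYERS 16

/* One layer of a rule: which level it belongs to and what it grants. */
struct model_layer {
	uint16_t level;
	uint64_t access;
	bool no_inherit;
};

struct model_rule {
	size_t num_layers;
	struct model_layer layers[MODEL_MAX_LAYERS];
};

/* Per-layer state: access bits still missing, plus the sticky flag. */
struct model_mask {
	uint64_t access;
	bool no_inherit;
};

struct model_masks {
	struct model_mask layers[MODEL_MAX_LAYERS];
};

static void model_unmask_layers(const struct model_rule *rule,
				struct model_masks *masks)
{
	for (size_t i = 0; i < rule->num_layers; i++) {
		const struct model_layer *layer = &rule->layers[i];
		struct model_mask *mask;

		assert(layer->level >= 1 && layer->level <= MODEL_MAX_LAYERS);
		mask = &masks->layers[layer->level - 1];

		/* A layer already marked no_inherit ignores further rules. */
		if (mask->no_inherit)
			continue;

		/* Granted bits are removed from the set still missing. */
		mask->access &= ~layer->access;

		/* The flag propagates from the rule to the layer mask. */
		if (layer->no_inherit)
			mask->no_inherit = true;
	}
}
```

Applying a descendant rule with no_inherit and then an ancestor rule to the
same layer leaves the ancestor's bits masked, which is the case
test_unmask_no_inherit_sequential checks.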

 security/landlock/ruleset.c | 182 ++++++++++++++++++++++++++++++++++++
 1 file changed, 182 insertions(+)

diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index c78e2b2d73ff..9f47d106aca3 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -6,6 +6,7 @@
  * Copyright © 2018-2020 ANSSI
  */
 
+#include <kunit/test.h>
 #include <linux/bits.h>
 #include <linux/bug.h>
 #include <linux/cleanup.h>
@@ -766,3 +767,184 @@ landlock_init_layer_masks(const struct landlock_ruleset *const domain,
 
 	return handled_accesses;
 }
+
+#ifdef CONFIG_SECURITY_LANDLOCK_KUNIT_TEST
+
+/*
+ * Helper to allocate a rule with @num_layers layers and initialize
+ * its num_layers field.  Caller must fill in individual layers.
+ */
+static struct landlock_rule *alloc_rule(struct kunit *test, u32 num_layers)
+{
+	struct landlock_rule *rule;
+
+	rule = kzalloc(struct_size(rule, layers, num_layers), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, rule);
+	rule->num_layers = num_layers;
+	return rule;
+}
+
+/*
+ * Build a layer_masks with the first @num_layers layers' access set to
+ * @val, and all no_inherit flags cleared.  Layers beyond @num_layers stay
+ * zeroed, matching what landlock_init_layer_masks() produces for a domain
+ * with that many layers.
+ */
+static void fill_masks(struct layer_masks *masks, access_mask_t val,
+		       size_t num_layers)
+{
+	memset(masks, 0, sizeof(*masks));
+	for (size_t i = 0; i < num_layers; i++)
+		masks->layers[i].access = val;
+}
+
+/* Verify that a rule with no_inherit unmasks access and propagates the flag. */
+static void test_unmask_no_inherit_propagates(struct kunit *const test)
+{
+	struct landlock_rule *rule = alloc_rule(test, 1);
+	struct layer_masks masks;
+	const access_mask_t req = BIT_ULL(0) | BIT_ULL(1);
+
+	rule->layers[0].level = 1;
+	rule->layers[0].access = BIT_ULL(0);
+	rule->layers[0].flags.no_inherit = true;
+
+	fill_masks(&masks, req, 1);
+	landlock_unmask_layers(rule, &masks);
+
+	/* access bit 0 should be cleared, bit 1 remains */
+	KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[0].access,
+			BIT_ULL(1));
+	KUNIT_EXPECT_TRUE(test, masks.layers[0].no_inherit);
+	KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[1].access, 0);
+	kfree(rule);
+}
+
+/* Verify that a pre-set no_inherit in the mask causes the layer to be skipped. */
+static void test_unmask_no_inherit_skip(struct kunit *const test)
+{
+	struct landlock_rule *rule = alloc_rule(test, 1);
+	struct layer_masks masks;
+	const access_mask_t req = BIT_ULL(0);
+
+	rule->layers[0].level = 1;
+	rule->layers[0].access = BIT_ULL(0);
+
+	fill_masks(&masks, req, 1);
+	masks.layers[0].no_inherit = true;
+	landlock_unmask_layers(rule, &masks);
+
+	/* bit 0 should NOT be cleared because layer was skipped */
+	KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[0].access, req);
+	KUNIT_EXPECT_TRUE(test, masks.layers[0].no_inherit);
+	kfree(rule);
+}
+
+/*
+ * Verify that no_inherit on the rule is still set when the mask already
+ * has no_inherit (the skip prevents access removal but the flag propagates).
+ */
+static void test_unmask_no_inherit_both_set(struct kunit *const test)
+{
+	struct landlock_rule *rule = alloc_rule(test, 1);
+	struct layer_masks masks;
+	const access_mask_t req = BIT_ULL(0);
+
+	rule->layers[0].level = 1;
+	rule->layers[0].access = BIT_ULL(0);
+	rule->layers[0].flags.no_inherit = true;
+
+	fill_masks(&masks, req, 1);
+	masks.layers[0].no_inherit = true;
+	landlock_unmask_layers(rule, &masks);
+
+	KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[0].access, req);
+	KUNIT_EXPECT_TRUE(test, masks.layers[0].no_inherit);
+	kfree(rule);
+}
+
+/*
+ * Verify that no_inherit on layer 1 of a multi-layer rule only affects
+ * layer 1; layer 2 still contributes normally.
+ */
+static void test_unmask_multilayer_no_inherit(struct kunit *const test)
+{
+	struct landlock_rule *rule = alloc_rule(test, 2);
+	struct layer_masks masks;
+	const access_mask_t req = BIT_ULL(0) | BIT_ULL(1);
+
+	rule->layers[0].level = 1;
+	rule->layers[0].access = BIT_ULL(0);
+	rule->layers[0].flags.no_inherit = true;
+
+	rule->layers[1].level = 2;
+	rule->layers[1].access = BIT_ULL(1);
+
+	fill_masks(&masks, req, 2);
+	landlock_unmask_layers(rule, &masks);
+
+	/* Layer 1: bit 0 cleared, no_inherit set */
+	KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[0].access, BIT_ULL(1));
+	KUNIT_EXPECT_TRUE(test, masks.layers[0].no_inherit);
+
+	/* Layer 2: bit 1 cleared, no_inherit not set */
+	KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[1].access, BIT_ULL(0));
+	KUNIT_EXPECT_FALSE(test, masks.layers[1].no_inherit);
+	kfree(rule);
+}
+
+/*
+ * Verify that when applying two rules sequentially (as happens during
+ * a path walk), no_inherit from the first rule prevents the second
+ * rule from contributing to that layer.
+ */
+static void test_unmask_no_inherit_sequential(struct kunit *const test)
+{
+	struct landlock_rule *rule1 = alloc_rule(test, 1);
+	struct landlock_rule *rule2 = alloc_rule(test, 1);
+	struct layer_masks masks;
+	const access_mask_t req = BIT_ULL(0) | BIT_ULL(1);
+
+	/* Rule 1: no_inherit on layer 1, grants access bit 0 */
+	rule1->layers[0].level = 1;
+	rule1->layers[0].access = BIT_ULL(0);
+	rule1->layers[0].flags.no_inherit = true;
+
+	/* Rule 2: also on layer 1, grants access bit 1 (ancestor rule) */
+	rule2->layers[0].level = 1;
+	rule2->layers[0].access = BIT_ULL(1);
+
+	/* Apply rule1 first (descendant), then rule2 (ancestor) */
+	fill_masks(&masks, req, 1);
+	landlock_unmask_layers(rule1, &masks);
+	landlock_unmask_layers(rule2, &masks);
+
+	/*
+	 * Rule2 should be skipped because rule1 set no_inherit.
+	 * bit 0 cleared by rule1, bit 1 remains because rule2 skipped.
+	 */
+	KUNIT_EXPECT_EQ(test, (access_mask_t)masks.layers[0].access, BIT_ULL(1));
+	KUNIT_EXPECT_TRUE(test, masks.layers[0].no_inherit);
+	kfree(rule1);
+	kfree(rule2);
+}
+
+/* clang-format off */
+static struct kunit_case test_cases[] = {
+	KUNIT_CASE(test_unmask_no_inherit_propagates),
+	KUNIT_CASE(test_unmask_no_inherit_skip),
+	KUNIT_CASE(test_unmask_no_inherit_both_set),
+	KUNIT_CASE(test_unmask_multilayer_no_inherit),
+	KUNIT_CASE(test_unmask_no_inherit_sequential),
+	{}
+};
+/* clang-format on */
+
+static struct kunit_suite test_suite = {
+	.name = "landlock_ruleset",
+	.test_cases = test_cases,
+};
+
+kunit_test_suite(test_suite);
+
+#endif /* CONFIG_SECURITY_LANDLOCK_KUNIT_TEST */
-- 
2.53.0


* [PATCH v8 09/10] selftests/landlock: Add selftests for LANDLOCK_ADD_RULE_NO_INHERIT
From: Justin Suess @ 2026-05-29  1:52 UTC (permalink / raw)
  To: gnoack3000, mic; +Cc: linux-kernel, linux-security-module, Justin Suess
In-Reply-To: <20260529015210.500291-1-utilityemal77@gmail.com>

Add test coverage for the new flag:

- New layout1_no_inherit fixture with five variants covering NO_INHERIT
  on leaf, middle, and root directories, RW-over-RO expansion, and a
  regular file target.  Three tests per variant exercise inheritance
  blocking, topology sealing, and layered (multi-domain) NO_INHERIT.

- A new layout4_disconnected_leafs variant exercising NO_INHERIT applied
  through a bind mount, asserting that ancestors in both the bind and
  source paths are sealed.

- A new audit_no_inherit fixture verifying that the flag interacts
  correctly with the quiet flag: a quiet ancestor does not suppress
  audit on a descendant that has crossed a NO_INHERIT boundary.

Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---

Notes:
    v7..v8 changes:
    
      * Reorganized the new fs_test.c coverage around fixtures and
        variants instead of one TEST_F_FORK per scenario:
    
        - New layout1_no_inherit fixture with five FIXTURE_VARIANT_ADD
          cases (rw_parent_ro_leaf, rw_parent_ro_middle, rw_parent_ro_root,
          ro_parent_rw_middle, rw_parent_read_file) collapse what were
          eight near-duplicate layout1 tests in v7 into three shared
          tests (blocks_inheritance, seals_topology, layered_no_inherit).
    
        - New layout4_disconnected_leafs variant 'no_inherit_mount' with a
          single 'no_inherit_seals_mount' test replaces the four
          v7 layout4 tests (no_inherit_mount_parent_{rename,rmdir,link}
          and no_inherit_source_parent_rename) by exercising all four
          sealed topology operations in one test.
    
        - New audit_no_inherit fixture with three variants
          (parent_is_logged, blocks_quiet_inheritance, quiet_parent)
          covers the quiet/no_inherit interaction previously inlined
          into an ad hoc audit test.
    
      * Net change: 705 added lines in v7 -> 419 added lines in v8, with
        equivalent coverage.
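
As a reading aid for the layout1_no_inherit variants, the expected results
can be reproduced with a toy single-layer model of the path walk. This is
only a sketch under simplifying assumptions (string paths, no mounts, the
root directory itself is never looked up); the model_* names and MODEL_*
bits are hypothetical and do not correspond to kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_READ	(1ULL << 0)
#define MODEL_WRITE	(1ULL << 1)

struct model_path_rule {
	const char *path;
	uint64_t access;
	bool no_inherit;
};

/*
 * Walk from @target up through its ancestors, collecting access from
 * every matching rule.  The walk stops after the first path carrying a
 * no_inherit rule, so rules on its ancestors no longer contribute.
 */
static uint64_t model_allowed_access(const struct model_path_rule *rules,
				     size_t num_rules, const char *target)
{
	char buf[256];
	char *slash;
	uint64_t allowed = 0;
	size_t len = strlen(target);

	assert(len > 0 && len < sizeof(buf));
	memcpy(buf, target, len + 1);

	while (len > 0) {
		bool stop = false;

		for (size_t i = 0; i < num_rules; i++) {
			if (strcmp(rules[i].path, buf) != 0)
				continue;
			allowed |= rules[i].access;
			if (rules[i].no_inherit)
				stop = true;
		}
		if (stop)
			break;

		/* Move to the parent by cutting at the last '/'. */
		slash = strrchr(buf, '/');
		if (!slash)
			break;
		*slash = '\0';
		len = (size_t)(slash - buf);
	}
	return allowed;
}
```

With a read/write rule on the top directory and a read-only no_inherit
rule on a leaf directory, a file below the leaf ends up readable but not
writable, which corresponds to the rw_parent_ro_leaf variant.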

 tools/testing/selftests/landlock/fs_test.c | 419 +++++++++++++++++++++
 1 file changed, 419 insertions(+)

diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index 2e32295258f9..625ff1afecb0 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -1429,6 +1429,224 @@ TEST_F_FORK(layout1, inherit_superset)
 	ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
 }
 
+FIXTURE(layout1_no_inherit) {};
+
+FIXTURE_SETUP(layout1_no_inherit)
+{
+	prepare_layout(_metadata);
+	create_layout1(_metadata);
+}
+
+FIXTURE_TEARDOWN_PARENT(layout1_no_inherit)
+{
+	remove_layout1(_metadata);
+	cleanup_layout(_metadata);
+}
+
+FIXTURE_VARIANT(layout1_no_inherit)
+{
+	const char *ni_path;
+	const __u64 ni_access;
+	const char *ni_file;
+	const char *desc_file;
+	const int expected_ni_write;
+	const int expected_ni_read;
+	const int expected_desc_write;
+	const int expected_desc_read;
+};
+
+/* NO_INHERIT on leaf directory: blocks parent's RW, grants only RO. */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout1_no_inherit, rw_parent_ro_leaf) {
+	/* clang-format on */
+	.ni_path = TMP_DIR "/s1d1/s1d2/s1d3",
+	.ni_access = ACCESS_RO,
+	.ni_file = TMP_DIR "/s1d1/s1d2/s1d3/f1",
+	.desc_file = TMP_DIR "/s1d1/s1d2/s1d3/f2",
+	.expected_ni_write = EACCES,
+	.expected_ni_read = 0,
+	.expected_desc_write = EACCES,
+	.expected_desc_read = 0,
+};
+
+/* NO_INHERIT on middle directory: blocks parent's RW for all descendants. */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout1_no_inherit, rw_parent_ro_middle) {
+	/* clang-format on */
+	.ni_path = TMP_DIR "/s1d1/s1d2",
+	.ni_access = ACCESS_RO,
+	.ni_file = TMP_DIR "/s1d1/s1d2/f1",
+	.desc_file = TMP_DIR "/s1d1/s1d2/s1d3/f1",
+	.expected_ni_write = EACCES,
+	.expected_ni_read = 0,
+	.expected_desc_write = EACCES,
+	.expected_desc_read = 0,
+};
+
+/* NO_INHERIT on root directory: blocks parent's RW for entire subtree. */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout1_no_inherit, rw_parent_ro_root) {
+	/* clang-format on */
+	.ni_path = TMP_DIR "/s1d1",
+	.ni_access = ACCESS_RO,
+	.ni_file = TMP_DIR "/s1d1/f1",
+	.desc_file = TMP_DIR "/s1d1/s1d2/s1d3/f1",
+	.expected_ni_write = EACCES,
+	.expected_ni_read = 0,
+	.expected_desc_write = EACCES,
+	.expected_desc_read = 0,
+};
+
+/* NO_INHERIT with RW access expands parent's RO to RW. */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout1_no_inherit, ro_parent_rw_middle) {
+	/* clang-format on */
+	.ni_path = TMP_DIR "/s1d1/s1d2",
+	.ni_access = ACCESS_RW,
+	.ni_file = TMP_DIR "/s1d1/s1d2/f1",
+	.desc_file = TMP_DIR "/s1d1/s1d2/s1d3/f1",
+	.expected_ni_write = 0,
+	.expected_ni_read = 0,
+	.expected_desc_write = 0,
+	.expected_desc_read = 0,
+};
+
+/* NO_INHERIT on a file: file gets only its explicit READ_FILE access. */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout1_no_inherit, rw_parent_read_file) {
+	/* clang-format on */
+	.ni_path = TMP_DIR "/s1d1/s1d2/f1",
+	.ni_access = LANDLOCK_ACCESS_FS_READ_FILE,
+	.ni_file = TMP_DIR "/s1d1/s1d2/f1",
+	.desc_file = TMP_DIR "/s1d1/s1d2/f2",
+	.expected_ni_write = EACCES,
+	.expected_ni_read = 0,
+	.expected_desc_write = 0,
+	.expected_desc_read = 0,
+};
+
+TEST_F_FORK(layout1_no_inherit, blocks_inheritance)
+{
+	struct landlock_ruleset_attr ruleset_attr = {
+		.handled_access_fs = ACCESS_RW,
+	};
+	int ruleset_fd;
+
+	/* RO variants: TMP_DIR gets RO instead of RW. */
+	if (variant->ni_access == ACCESS_RW)
+		ruleset_attr.handled_access_fs |= LANDLOCK_ACCESS_FS_READ_DIR;
+
+	ruleset_fd =
+		landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+	ASSERT_LE(0, ruleset_fd);
+
+	if (variant->ni_access == ACCESS_RW)
+		add_path_beneath(_metadata, ruleset_fd, ACCESS_RO, TMP_DIR, 0);
+	else
+		add_path_beneath(_metadata, ruleset_fd, ACCESS_RW, TMP_DIR, 0);
+
+	add_path_beneath(_metadata, ruleset_fd, variant->ni_access,
+			 variant->ni_path, LANDLOCK_ADD_RULE_NO_INHERIT);
+
+	enforce_ruleset(_metadata, ruleset_fd);
+	ASSERT_EQ(0, close(ruleset_fd));
+
+	EXPECT_EQ(variant->expected_ni_write,
+		  test_open(variant->ni_file, O_WRONLY));
+	EXPECT_EQ(variant->expected_ni_read,
+		  test_open(variant->ni_file, O_RDONLY));
+
+	if (variant->desc_file != variant->ni_file) {
+		EXPECT_EQ(variant->expected_desc_write,
+			  test_open(variant->desc_file, O_WRONLY));
+		EXPECT_EQ(variant->expected_desc_read,
+			  test_open(variant->desc_file, O_RDONLY));
+	}
+}
+
+TEST_F_FORK(layout1_no_inherit, seals_topology)
+{
+	int ruleset_fd;
+	struct landlock_ruleset_attr ruleset_attr = {
+		.handled_access_fs = ACCESS_RW | LANDLOCK_ACCESS_FS_REFER |
+				     LANDLOCK_ACCESS_FS_REMOVE_FILE |
+				     LANDLOCK_ACCESS_FS_REMOVE_DIR,
+	};
+
+	ruleset_fd =
+		landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+	ASSERT_LE(0, ruleset_fd);
+
+	add_path_beneath(_metadata, ruleset_fd,
+			 ACCESS_RW | LANDLOCK_ACCESS_FS_REFER |
+				 LANDLOCK_ACCESS_FS_REMOVE_FILE |
+				 LANDLOCK_ACCESS_FS_REMOVE_DIR,
+			 TMP_DIR, 0);
+	add_path_beneath(_metadata, ruleset_fd, variant->ni_access,
+			 variant->ni_path, LANDLOCK_ADD_RULE_NO_INHERIT);
+
+	enforce_ruleset(_metadata, ruleset_fd);
+	ASSERT_EQ(0, close(ruleset_fd));
+
+	/* The directory bearing NO_INHERIT cannot be renamed or removed. */
+	ASSERT_EQ(-1, rename(variant->ni_path, TMP_DIR "/ni_renamed"));
+	ASSERT_EQ(EACCES, errno);
+
+	/*
+	 * Content inside the NO_INHERIT directory is still mutable
+	 * (if the access rights permit it).
+	 */
+	if (variant->ni_access & LANDLOCK_ACCESS_FS_REMOVE_FILE) {
+		ASSERT_EQ(0, unlink(variant->ni_file));
+	} else {
+		ASSERT_EQ(-1, unlink(variant->ni_file));
+	}
+
+	/* Unrelated operations outside the sealed branch still work. */
+	ASSERT_EQ(0, unlink(file1_s2d1));
+	ASSERT_EQ(0, mknod(file1_s2d1, S_IFREG | 0700, 0));
+}
+
+TEST_F_FORK(layout1_no_inherit, layered_no_inherit)
+{
+	const struct rule layer_rules[] = {
+		{
+			.path = TMP_DIR,
+			.access = ACCESS_RW | LANDLOCK_ACCESS_FS_REMOVE_FILE,
+		},
+		{},
+	};
+	int ruleset_fd;
+
+	/* Layer 1: RW on TMP_DIR. */
+	ruleset_fd = create_ruleset(_metadata,
+				    ACCESS_RW | LANDLOCK_ACCESS_FS_REMOVE_FILE,
+				    layer_rules);
+	ASSERT_LE(0, ruleset_fd);
+	enforce_ruleset(_metadata, ruleset_fd);
+	ASSERT_EQ(0, close(ruleset_fd));
+
+	/* Layer 2: NO_INHERIT on the target. */
+	ruleset_fd = create_ruleset(_metadata,
+				    ACCESS_RW | LANDLOCK_ACCESS_FS_REMOVE_FILE,
+				    layer_rules);
+	ASSERT_LE(0, ruleset_fd);
+	add_path_beneath(_metadata, ruleset_fd, variant->ni_access,
+			 variant->ni_path, LANDLOCK_ADD_RULE_NO_INHERIT);
+	enforce_ruleset(_metadata, ruleset_fd);
+	ASSERT_EQ(0, close(ruleset_fd));
+
+	/* The target path cannot be renamed. */
+	ASSERT_EQ(-1, rename(variant->ni_path, TMP_DIR "/ni_renamed_layered"));
+	ASSERT_EQ(EACCES, errno);
+
+	/* Content at NI path respects the NO_INHERIT access from layer 2. */
+	EXPECT_EQ(variant->expected_ni_write,
+		  test_open(variant->ni_file, O_WRONLY));
+	EXPECT_EQ(variant->expected_ni_read,
+		  test_open(variant->ni_file, O_RDONLY));
+}
+
 TEST_F_FORK(layout0, max_layers)
 {
 	int i, err;
@@ -5571,6 +5789,25 @@ FIXTURE_VARIANT(layout4_disconnected_leafs)
 	const int expected_exchange_result;
 	/* Expected result of the call to renameat([fd:s1d42]/f4, [fd:s1d42]/f5). */
 	const int expected_same_dir_rename_result;
+
+	/*
+	 * If true, a NO_INHERIT rule is set on s1d41 (via the bind mount
+	 * at s2d2).  Used by the no_inherit_mount test.
+	 */
+	bool no_inherit_on_s1d41;
+	/*
+	 * Access rights used for the optional NO_INHERIT rule on s1d41.
+	 */
+	const __u64 no_inherit_access;
+	/*
+	 * Expected result of renaming s1d31 (parent of s1d41 within the
+	 * mount) when no_inherit_on_s1d41 is set.
+	 */
+	const int expected_parent_rename;
+	/*
+	 * Expected result of rmdir on s1d31, when no_inherit_on_s1d41 is set.
+	 */
+	const int expected_parent_rmdir;
 };
 
 /* clang-format off */
@@ -5823,6 +6060,26 @@ FIXTURE_VARIANT_ADD(layout4_disconnected_leafs, f1_f2_f3) {
 	.expected_exchange_result = EACCES,
 };
 
+/*
+ * NO_INHERIT variant: s1d41 is protected with ACCESS_RO via the bind mount.
+ * Parents within the mount are sealed against topology changes.
+ */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected_leafs, no_inherit_mount) {
+	/* clang-format on */
+	.allowed_f1 = LANDLOCK_ACCESS_FS_READ_FILE,
+	.allowed_f2 = LANDLOCK_ACCESS_FS_READ_FILE,
+	.allowed_f3 = LANDLOCK_ACCESS_FS_READ_FILE,
+	.expected_read_result = 0,
+	.expected_rename_result = EACCES,
+	.expected_exchange_result = EACCES,
+	.expected_same_dir_rename_result = EACCES,
+	.no_inherit_on_s1d41 = true,
+	.no_inherit_access = ACCESS_RO,
+	.expected_parent_rename = EACCES,
+	.expected_parent_rmdir = EACCES,
+};
+
 TEST_F_FORK(layout4_disconnected_leafs, read_rename_exchange)
 {
 	const __u64 handled_access =
@@ -5931,6 +6188,70 @@ TEST_F_FORK(layout4_disconnected_leafs, read_rename_exchange)
 		  test_renameat(s1d42_bind_fd, "f4", s1d42_bind_fd, "f5"));
 }
 
+/*
+ * When s1d41 (accessed via the bind mount at s2d2) is protected with
+ * NO_INHERIT, its parent directories within the mount are sealed from
+ * topology changes.  Other variants do not exercise NO_INHERIT and skip
+ * this test.
+ */
+TEST_F_FORK(layout4_disconnected_leafs, no_inherit_seals_mount)
+{
+	struct landlock_ruleset_attr ruleset_attr = {
+		.handled_access_fs = ACCESS_RW | LANDLOCK_ACCESS_FS_REFER |
+				     LANDLOCK_ACCESS_FS_REMOVE_FILE |
+				     LANDLOCK_ACCESS_FS_REMOVE_DIR,
+	};
+	int ruleset_fd, s1d41_bind_fd;
+
+	if (!variant->no_inherit_on_s1d41)
+		SKIP(return, "variant does not set NO_INHERIT on s1d41");
+
+	ruleset_fd =
+		landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+	ASSERT_LE(0, ruleset_fd);
+
+	add_path_beneath(_metadata, ruleset_fd,
+			 ACCESS_RW | LANDLOCK_ACCESS_FS_REFER |
+				 LANDLOCK_ACCESS_FS_REMOVE_FILE |
+				 LANDLOCK_ACCESS_FS_REMOVE_DIR,
+			 TMP_DIR, 0);
+
+	s1d41_bind_fd = open(TMP_DIR "/s2d1/s2d2/s1d31/s1d41",
+			     O_DIRECTORY | O_PATH | O_CLOEXEC);
+	ASSERT_LE(0, s1d41_bind_fd);
+
+	ASSERT_EQ(0, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+				       &(struct landlock_path_beneath_attr){
+					       .parent_fd = s1d41_bind_fd,
+					       .allowed_access =
+						       variant->no_inherit_access,
+				       },
+				       LANDLOCK_ADD_RULE_NO_INHERIT));
+	EXPECT_EQ(0, close(s1d41_bind_fd));
+
+	enforce_ruleset(_metadata, ruleset_fd);
+	ASSERT_EQ(0, close(ruleset_fd));
+
+	/* Parent of s1d41 within the mount is sealed. */
+	ASSERT_EQ(-1, rmdir(TMP_DIR "/s2d1/s2d2/s1d31"));
+	ASSERT_EQ(variant->expected_parent_rmdir, errno);
+
+	ASSERT_EQ(-1, rename(TMP_DIR "/s2d1/s2d2/s1d31",
+			     TMP_DIR "/s2d1/s2d2/s1d31_renamed"));
+	ASSERT_EQ(variant->expected_parent_rename, errno);
+
+	/* Sibling directories outside the sealed chain are free. */
+	ASSERT_EQ(0, rename(TMP_DIR "/s2d1/s2d2/s1d32",
+			    TMP_DIR "/s2d1/s2d2/s1d32_renamed"));
+	ASSERT_EQ(0, rename(TMP_DIR "/s2d1/s2d2/s1d32_renamed",
+			    TMP_DIR "/s2d1/s2d2/s1d32"));
+
+	/* The mount source parent hierarchy is also sealed. */
+	ASSERT_EQ(-1, rename(TMP_DIR "/s1d1/s1d2/s1d31",
+			     TMP_DIR "/s1d1/s1d2/s1d31_renamed"));
+	ASSERT_EQ(variant->expected_parent_rename, errno);
+}
+
 /*
  * layout5_disconnected_branch before rename:
  *
@@ -7358,6 +7679,104 @@ TEST_F(audit_layout1, write_file)
 	EXPECT_EQ(1, records.domain);
 }
 
+FIXTURE(audit_no_inherit)
+{
+	struct audit_filter audit_filter;
+	int audit_fd;
+};
+
+FIXTURE_SETUP(audit_no_inherit)
+{
+	prepare_layout(_metadata);
+	create_layout1(_metadata);
+
+	set_cap(_metadata, CAP_AUDIT_CONTROL);
+	self->audit_fd = audit_init_with_exe_filter(&self->audit_filter);
+	EXPECT_LE(0, self->audit_fd);
+	clear_cap(_metadata, CAP_AUDIT_CONTROL);
+}
+
+FIXTURE_TEARDOWN_PARENT(audit_no_inherit)
+{
+	remove_layout1(_metadata);
+	cleanup_layout(_metadata);
+
+	EXPECT_EQ(0, audit_cleanup(-1, NULL));
+}
+
+FIXTURE_VARIANT(audit_no_inherit)
+{
+	bool parent_quiet;
+	const char *test_path;
+	bool expect_audit_log;
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(audit_no_inherit, parent_is_logged) {
+	/* clang-format on */
+	.parent_quiet = false,
+	.test_path = TMP_DIR "/s1d1/s1d2/f1",
+	.expect_audit_log = true,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(audit_no_inherit, blocks_quiet_inheritance) {
+	/* clang-format on */
+	.parent_quiet = true,
+	.test_path = TMP_DIR "/s1d1/s1d2/s1d3/f1",
+	.expect_audit_log = true,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(audit_no_inherit, quiet_parent) {
+	/* clang-format on */
+	.parent_quiet = true,
+	.test_path = TMP_DIR "/s1d1/f1",
+	.expect_audit_log = false,
+};
+
+TEST_F(audit_no_inherit, no_inherit_audit)
+{
+	struct audit_records records;
+	struct landlock_ruleset_attr ruleset_attr = {
+		.handled_access_fs = ACCESS_RW,
+		.quiet_access_fs = variant->parent_quiet ? ACCESS_RW : 0,
+	};
+	int ruleset_fd;
+
+	ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+					     sizeof(ruleset_attr), 0);
+	ASSERT_LE(0, ruleset_fd);
+
+	if (variant->parent_quiet)
+		add_path_beneath(_metadata, ruleset_fd, ACCESS_RO, dir_s1d1,
+				 LANDLOCK_ADD_RULE_QUIET);
+	else
+		add_path_beneath(_metadata, ruleset_fd, ACCESS_RO, dir_s1d1, 0);
+
+	add_path_beneath(_metadata, ruleset_fd, ACCESS_RO, dir_s1d3,
+			 LANDLOCK_ADD_RULE_NO_INHERIT);
+
+	enforce_ruleset(_metadata, ruleset_fd);
+
+	EXPECT_EQ(EACCES, test_open(variant->test_path, O_WRONLY));
+	if (variant->expect_audit_log) {
+		EXPECT_EQ(0, matches_log_fs(_metadata, self->audit_fd,
+					    "fs\\.write_file",
+					    variant->test_path));
+	} else {
+		EXPECT_NE(0, matches_log_fs(_metadata, self->audit_fd,
+					    "fs\\.write_file",
+					    variant->test_path));
+	}
+
+	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
+	EXPECT_EQ(0, records.access);
+	EXPECT_EQ(variant->expect_audit_log ? 1 : 0, records.domain);
+
+	EXPECT_EQ(0, close(ruleset_fd));
+}
+
 TEST_F(audit_layout1, read_file)
 {
 	struct audit_records records;
-- 
2.53.0


* [PATCH v8 08/10] samples/landlock: Add LANDLOCK_ADD_RULE_NO_INHERIT to landlock-sandboxer
From: Justin Suess @ 2026-05-29  1:52 UTC (permalink / raw)
  To: gnoack3000, mic
  Cc: linux-kernel, linux-security-module, Justin Suess, Tingmao Wang
In-Reply-To: <20260529015210.500291-1-utilityemal77@gmail.com>

Add a new LL_FS_NO_INHERIT environment variable to the sandboxer.
Paths listed in it are added with the LANDLOCK_ADD_RULE_NO_INHERIT
flag, demonstrating how to set up a parent directory with broader
access than its children.

The flag is silently skipped on kernels older than ABI 10.

Cc: Tingmao Wang <m@maowtm.org>
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---

Notes:
    v7..v8 changes:
    
      * Reworded commit message.
      * Updated the ABI fallthrough comment to mention 'quiet or
        no_inherit flags'. No code change beyond the comment update.

 samples/landlock/sandboxer.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
index 74ee53afed6a..b126ffd7cd4f 100644
--- a/samples/landlock/sandboxer.c
+++ b/samples/landlock/sandboxer.c
@@ -60,6 +60,7 @@ static inline int landlock_restrict_self(const int ruleset_fd,
 #define ENV_FS_RW_NAME "LL_FS_RW"
 #define ENV_FS_QUIET_NAME "LL_FS_QUIET"
 #define ENV_FS_QUIET_ACCESS_NAME "LL_FS_QUIET_ACCESS"
+#define ENV_FS_NO_INHERIT_NAME "LL_FS_NO_INHERIT"
 #define ENV_TCP_BIND_NAME "LL_TCP_BIND"
 #define ENV_TCP_CONNECT_NAME "LL_TCP_CONNECT"
 #define ENV_NET_QUIET_NAME "LL_NET_QUIET"
@@ -395,6 +396,7 @@ static const char help[] =
 	"but to test audit we can set " ENV_FORCE_LOG_NAME "=1\n"
 	ENV_FS_QUIET_NAME " and " ENV_NET_QUIET_NAME ", both optional, can then be used "
 	"to make access to some denied paths or network ports not trigger audit logging.\n"
+	ENV_FS_NO_INHERIT_NAME " can be used to suppress access right propagation (ABI >= 10).\n"
 	ENV_FS_QUIET_ACCESS_NAME " and " ENV_NET_QUIET_ACCESS_NAME " can be used to specify "
 	"which accesses should be quieted (defaults to all):\n"
 	"* " ENV_FS_QUIET_ACCESS_NAME ": file system accesses to quiet\n"
@@ -447,6 +449,7 @@ int main(const int argc, char *const argv[], char *const *const envp)
 	};
 
 	bool quiet_supported = true;
+	bool no_inherit_supported = true;
 	int supported_restrict_flags = LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON;
 	int set_restrict_flags = 0;
 
@@ -538,8 +541,9 @@ int main(const int argc, char *const argv[], char *const *const envp)
 			~(LANDLOCK_ACCESS_NET_BIND_UDP |
 			  LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP);
 		__attribute__((fallthrough));
-		/* Don't add quiet flags for ABI < 10 later on. */
+		/* Don't add quiet or no_inherit flags for ABI < 10 later on. */
 		quiet_supported = false;
+		no_inherit_supported = false;
 
 		/* Must be printed for any ABI < LANDLOCK_ABI_LAST. */
 		fprintf(stderr,
@@ -644,6 +648,13 @@ int main(const int argc, char *const argv[], char *const *const envp)
 			goto err_close_ruleset;
 	}
 
+	/* Don't require this env to be present. */
+	if (no_inherit_supported && getenv(ENV_FS_NO_INHERIT_NAME)) {
+		if (populate_ruleset_fs(ENV_FS_NO_INHERIT_NAME, ruleset_fd, 0,
+					LANDLOCK_ADD_RULE_NO_INHERIT))
+			goto err_close_ruleset;
+	}
+
 	if (populate_ruleset_net(ENV_TCP_BIND_NAME, ruleset_fd,
 				 LANDLOCK_ACCESS_NET_BIND_TCP, 0)) {
 		goto err_close_ruleset;
-- 
2.53.0

