Linux Security Modules development
* [PATCH] LSM: check if lsmprop_to_secctx call is supported by LSM
From: Sebastian Bockholt @ 2026-06-19 17:19 UTC (permalink / raw)
  To: linux-security-module; +Cc: serge, jmorris, paul, Sebastian Bockholt

In include/linux/lsm_hook_defs.h, lsmprop_to_secctx is defined with
a default return value of -EOPNOTSUPP.
The function bpf_lsm_lsmprop_to_secctx, defined in
security/bpf/hooks.c, returns the hook's default value. Therefore,
directly returning the result of the bpf_lsm_lsmprop_to_secctx call
propagates an unchecked EOPNOTSUPP error.

Signed-off-by: Sebastian Bockholt <sebastian.bockholt@bevuta.com>
---
 security/security.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/security/security.c b/security/security.c
index 71aea8fdf014..9c63699d45fc 100644
--- a/security/security.c
+++ b/security/security.c
@@ -3954,12 +3954,16 @@ EXPORT_SYMBOL(security_secid_to_secctx);
 int security_lsmprop_to_secctx(struct lsm_prop *prop, struct lsm_context *cp,
 			       int lsmid)
 {
+	int error;
 	struct lsm_static_call *scall;
 
 	lsm_for_each_hook(scall, lsmprop_to_secctx) {
 		if (lsmid != LSM_ID_UNDEF && lsmid != scall->hl->lsmid->id)
 			continue;
-		return scall->hl->hook.lsmprop_to_secctx(prop, cp);
+		error = scall->hl->hook.lsmprop_to_secctx(prop, cp);
+		if (error == -EOPNOTSUPP)
+			continue;
+		return error;
 	}
 	return LSM_RET_DEFAULT(lsmprop_to_secctx);
 }
-- 
2.54.0


^ permalink raw reply related
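
The control-flow change in the patch above can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the kernel code: `struct hook_entry`, `first_supported_secctx()`, `stub_hook()`, `real_hook()` and `SKETCH_ID_UNDEF` are hypothetical names introduced only for illustration, and the hooks take no arguments to keep the example short.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_ID_UNDEF 0	/* hypothetical stand-in for LSM_ID_UNDEF */

/* Hypothetical stand-in for one registered lsmprop_to_secctx hook. */
struct hook_entry {
	int lsmid;
	int (*to_secctx)(void);
};

/* Models a hook that only returns the default value. */
static int stub_hook(void)
{
	return -EOPNOTSUPP;
}

/* Models a hook that succeeds; 7 is an arbitrary success value. */
static int real_hook(void)
{
	return 7;
}

/*
 * Walk the hooks in order and return the first result that is not
 * -EOPNOTSUPP. If no hook matches, or every matching hook is
 * unsupported, fall back to the default return value.
 */
static int first_supported_secctx(const struct hook_entry *hooks, size_t n,
				  int lsmid)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int error;

		if (lsmid != SKETCH_ID_UNDEF && lsmid != hooks[i].lsmid)
			continue;
		error = hooks[i].to_secctx();
		if (error == -EOPNOTSUPP)
			continue;	/* try the next hook instead of giving up */
		return error;
	}
	return -EOPNOTSUPP;
}
```

With a stub hook registered ahead of a real one, a lookup with the undefined ID now reaches the real hook, while a lookup that names the stub's ID still ends with -EOPNOTSUPP.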

* [PATCH 1/2] bpf: lsm: disable xfrm_decode_session hook attachment
From: Bradley Morgan @ 2026-06-19 13:03 UTC (permalink / raw)
  To: linux-security-module, bpf
  Cc: linux-kernel, Bradley Morgan, stable, KP Singh, Matt Bobrowski,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Martin KaFai Lau,
	Song Liu, Yonghong Song, Jiri Olsa, Emil Tsalapatis,
	Florent Revest, Brendan Jackman

BPF LSM programs can currently attach to xfrm_decode_session(). That
hook may return an error, but security_skb_classify_flow() calls it
from a void path and triggers BUG_ON() if an error is returned.

Disable BPF attachment to the hook to prevent a BPF LSM program from
turning packet classification into a full panic.

Fixes: 9e4e01dfd325 ("bpf: lsm: Implement attach, detach and execution")
Cc: stable@vger.kernel.org
Signed-off-by: Bradley Morgan <include@grrlz.net>
---
 kernel/bpf/bpf_lsm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index 564071a92d7d..1433809bb166 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -51,6 +51,9 @@ BTF_ID(func, bpf_lsm_key_getsecurity)
 #ifdef CONFIG_AUDIT
 BTF_ID(func, bpf_lsm_audit_rule_match)
 #endif
+#ifdef CONFIG_SECURITY_NETWORK_XFRM
+BTF_ID(func, bpf_lsm_xfrm_decode_session)
+#endif
 BTF_ID(func, bpf_lsm_ismaclabel)
 BTF_ID(func, bpf_lsm_file_alloc_security)
 BTF_SET_END(bpf_lsm_disabled_hooks)
-- 
2.53.0


^ permalink raw reply related
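
The effect of adding a hook to the disabled set can be pictured as a deny-list lookup at attach time. This is a simplified user-space sketch under stated assumptions: the kernel keeps BTF IDs in the `bpf_lsm_disabled_hooks` set, whereas this example compares strings, and `disabled_hooks_sketch` and `attach_allowed_sketch()` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical string-based model of the disabled-hooks set. */
static const char *const disabled_hooks_sketch[] = {
	"bpf_lsm_audit_rule_match",
	"bpf_lsm_xfrm_decode_session",	/* the entry added by the patch */
	"bpf_lsm_ismaclabel",
	"bpf_lsm_file_alloc_security",
};

/* Return false when the named hook is on the deny list. */
static bool attach_allowed_sketch(const char *hook)
{
	size_t i;
	size_t n = sizeof(disabled_hooks_sketch) /
		   sizeof(disabled_hooks_sketch[0]);

	for (i = 0; i < n; i++) {
		if (strcmp(hook, disabled_hooks_sketch[i]) == 0)
			return false;
	}
	return true;
}
```

Rejecting the attachment up front means no program can ever return an error from the hook, so the void caller never sees one.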

* [PATCH 2/2] lsm: fix size queries for getselfattr with NULL buffer
From: Bradley Morgan @ 2026-06-19 13:03 UTC (permalink / raw)
  To: linux-security-module, bpf
  Cc: linux-kernel, Bradley Morgan, stable, Paul Moore, James Morris,
	Serge E. Hallyn, Shuah Khan, linux-kselftest
In-Reply-To: <20260619130305.27779-1-include@grrlz.net>

The lsm_get_self_attr() syscall allows callers to pass in a NULL context
buffer to find out the size of the output needed. That path still
compared the computed entry size against the caller provided size first,
so a NULL buffer with size 0 incorrectly returned -E2BIG rather than
reporting the required size.

Only enforce the available buffer length after checking for the NULL
buffer. Cover the zero length sizing query in the self test.

Fixes: d7cf3412a9f6 ("lsm: consolidate buffer size handling into lsm_fill_user_ctx()")
Cc: stable@vger.kernel.org
Signed-off-by: Bradley Morgan <include@grrlz.net>
---
 security/security.c                                  | 8 ++++----
 tools/testing/selftests/lsm/lsm_get_self_attr_test.c | 5 ++---
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/security/security.c b/security/security.c
index 71aea8fdf014..fa0d7e036249 100644
--- a/security/security.c
+++ b/security/security.c
@@ -406,15 +406,15 @@ int lsm_fill_user_ctx(struct lsm_ctx __user *uctx, u32 *uctx_len,
 	int rc = 0;
 
 	nctx_len = ALIGN(struct_size(nctx, ctx, val_len), sizeof(void *));
+	/* no buffer - return success/0 and set @uctx_len to the req size */
+	if (!uctx)
+		goto out;
+
 	if (nctx_len > *uctx_len) {
 		rc = -E2BIG;
 		goto out;
 	}
 
-	/* no buffer - return success/0 and set @uctx_len to the req size */
-	if (!uctx)
-		goto out;
-
 	nctx = kzalloc(nctx_len, GFP_KERNEL);
 	if (nctx == NULL) {
 		rc = -ENOMEM;
diff --git a/tools/testing/selftests/lsm/lsm_get_self_attr_test.c b/tools/testing/selftests/lsm/lsm_get_self_attr_test.c
index 60caf8528f81..2f5ababc2b95 100644
--- a/tools/testing/selftests/lsm/lsm_get_self_attr_test.c
+++ b/tools/testing/selftests/lsm/lsm_get_self_attr_test.c
@@ -39,15 +39,14 @@ TEST(size_null_lsm_get_self_attr)
 
 TEST(ctx_null_lsm_get_self_attr)
 {
-	const long page_size = sysconf(_SC_PAGESIZE);
-	__u32 size = page_size;
+	__u32 size = 0;
 	int rc;
 
 	rc = lsm_get_self_attr(LSM_ATTR_CURRENT, NULL, &size, 0);
 
 	if (attr_lsm_count()) {
 		ASSERT_NE(-1, rc);
-		ASSERT_NE(1, size);
+		ASSERT_NE(0, size);
 	} else {
 		ASSERT_EQ(-1, rc);
 	}
-- 
2.53.0


^ permalink raw reply related
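
The reordering in the patch above is easier to see in isolation. Below is a minimal user-space sketch of the same check order, not the kernel function: `struct ctx_sketch`, `ALIGN_UP()` and `fill_ctx_sketch()` are hypothetical names, the header layout is an assumption made for the example, and plain memory writes stand in for the copy to user space.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout, assumed for this sketch only. */
struct ctx_sketch {
	uint64_t id;
	uint64_t flags;
	uint64_t len;
	uint64_t ctx_len;
	unsigned char ctx[];
};

#define ALIGN_UP(x, a) (((x) + (a) - 1) / (a) * (a))

/*
 * Report the required size in *ubuf_len in every case. A NULL buffer is
 * a pure size query and succeeds regardless of the size passed in; the
 * length check applies only when a buffer was actually supplied.
 */
static int fill_ctx_sketch(void *ubuf, uint32_t *ubuf_len,
			   const void *val, size_t val_len)
{
	size_t need = ALIGN_UP(sizeof(struct ctx_sketch) + val_len,
			       sizeof(void *));
	struct ctx_sketch *hdr;
	int rc = 0;

	/* no buffer: return success and report the required size */
	if (!ubuf)
		goto out;

	if (need > *ubuf_len) {
		rc = -E2BIG;
		goto out;
	}

	memset(ubuf, 0, need);
	hdr = ubuf;
	hdr->len = need;
	hdr->ctx_len = val_len;
	memcpy(hdr->ctx, val, val_len);
out:
	*ubuf_len = (uint32_t)need;
	return rc;
}
```

A caller can first pass NULL with a length of 0 to learn the required size, then allocate a suitably aligned buffer of that size and call again.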

* Re: [PATCH bpf-next v3 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: Christian Brauner @ 2026-06-19 10:25 UTC (permalink / raw)
  To: David Windsor
  Cc: viro, jack, ast, daniel, john.fastabend, andrii, eddyz87, memxor,
	martin.lau, song, yonghong.song, jolsa, emil, kpsingh,
	mattbobrowski, paul, jmorris, serge, zohar, roberto.sassu,
	dmitry.kasatkin, eric.snowberg, stephen.smalley.work, omosnace,
	casey, shuah, linux-kernel, linux-fsdevel, bpf,
	linux-security-module, linux-integrity, selinux, linux-kselftest
In-Reply-To: <20260618203411.73917-2-dwindsor@gmail.com>

On Thu, Jun 18, 2026 at 04:34:10PM -0400, David Windsor wrote:
> Add bpf_init_inode_xattr() kfunc for BPF LSM programs to atomically set
> xattrs via the inode_init_security hook using lsm_get_xattr_slot().
> 
> The inode_init_security hook previously took the xattr array and count
> as two separate output parameters (struct xattr *xattrs, int
> *xattr_count), which BPF programs cannot write to. Pass the xattr state
> as a single context object (struct xattr_ctx) instead, and have
> bpf_init_inode_xattr() take that context directly. Update the existing
> in-tree callers of inode_init_security to take and forward the new
> xattr_ctx.
> 
> A previous attempt [1] required a kmalloc string output protocol for
> the xattr name. Since commit 6bcdfd2cac55 ("security: Allow all LSMs to
> provide xattrs for inode_init_security hook") [2], the xattr name is no
> longer allocated; it is a static constant.
> 
> Because we rely on the hook-specific ctx layout, the kfunc is
> restricted to lsm/inode_init_security. Restrict the xattr names that
> may be set via this kfunc to the bpf.* namespace.
> 
> Link: https://kernsec.org/pipermail/linux-security-module-archive/2022-October/034878.html [1]
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6bcdfd2cac55 [2]
> Suggested-by: Song Liu <song@kernel.org>
> Signed-off-by: David Windsor <dwindsor@gmail.com>
> ---
>  fs/bpf_fs_kfuncs.c                | 106 +++++++++++++++++++++++++++++-

Please split this into the VFS changes and lsm changes required for
this. The api change to the lsm layer can be done independently of any
of the actual VFS level wiring. Will also make it a lot nicer to
review...

^ permalink raw reply

* Re: [PATCH v5 7/8] vfs: Replace security_sb_mount/security_move_mount with granular hooks
From: Christian Brauner @ 2026-06-19 10:17 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-security-module, linux-fsdevel, selinux, apparmor, paul,
	jmorris, serge, viro, jack, john.johansen, stephen.smalley.work,
	omosnace, mic, gnoack, takedakn, penguin-kernel, herton,
	kernel-team
In-Reply-To: <178179134209.111814.12159888808546010170.b4-reply@b4>

On Thu, Jun 18, 2026 at 04:02:22PM +0200, Christian Brauner wrote:
> On 2026-06-18 18:56:42+08:00, Song Liu wrote:
> > On Wed, Jun 17, 2026 at 9:53 PM Christian Brauner <brauner@kernel.org> wrote:
> > 
> > > On Thu, May 28, 2026 at 11:26:06AM -0700, Song Liu wrote:
> > 
> > [...]
> > 
> > > >
> > >
> > > This again is racy as it is called outside of the namespace semaphore:
> > >
> > >         err = security_mount_bind(&old_path, path, recurse);
> > >         if (err)
> > >                 return err;
> > >
> > >         if (mnt_ns_loop(old_path.dentry))
> > >                 return -EINVAL;
> > >
> > >         LOCK_MOUNT(mp, path);
> > >         if (IS_ERR(mp.parent))
> > >                 return PTR_ERR(mp.parent);
> > >
> > > After LOCK_MOUNT @path might point to a completely different mount than
> > > the one you performed your security checks on.
> > 
> > I thought we agreed at LSF/MM/BPF 2026 to add the LSM hooks
> > before taking namespace semaphore, so that it is possible for LSMs
> > to defend against DoS attacks on namespace semaphore? Did I
> > miss/misunderstand something?
> 
> I think there was a misunderstanding. What I pointed out was that it's a
> trade-off. If we do call security hooks under the namespace semaphore or
> mount lock then anything that's called under there must take care to not
> cause deadlocks - which is especially easy to do with mount lock and
> even with the namespace semaphore it may get hairy (automounts etc). The
> DoS thing is another worry but if an LSM does stupid things we tell it
> to not do stupid things and to go away.
> 
> But as the hooks are done right now they are meaningless from a security
> perspective. You might have a policy that allows mounting on dentry_a
> and deny mounting on dentry_b: before LOCK_MOUNT*() you may see dentry_a
> and allow the mount but after LOCK_MOUNT*() someone raced you and shoved
> a dentry_b mount onto dentry_b and now you allow overmounting dentry_b

*a dentry_b mount onto dentry_a

^ permalink raw reply
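
The race described in this message is a check-then-use ordering problem. The following single-threaded sketch models it deterministically; it is an illustration under stated assumptions, not VFS code. `struct target_sketch`, `policy_allows()`, `racer_overmounts()` and `lock_and_resolve()` are hypothetical names, and the "race" is simulated by calling the racer at a fixed point.

```c
#include <stdbool.h>

enum { DENTRY_A = 1, DENTRY_B = 2 };

/* Hypothetical model of a mount target: what currently sits on top. */
struct target_sketch {
	int top;
};

/* Hypothetical policy: mounting on dentry_a is allowed, dentry_b is not. */
static bool policy_allows(int dentry)
{
	return dentry == DENTRY_A;
}

/* Models a concurrent mounter that wins before the lock is taken. */
static void racer_overmounts(struct target_sketch *t)
{
	t->top = DENTRY_B;
}

/* Models taking the lock: returns whatever is on top once it is held. */
static int lock_and_resolve(const struct target_sketch *t)
{
	return t->top;
}

/* Check first, lock second. Returns the overmounted dentry, 0 if denied. */
static int mount_check_then_lock(struct target_sketch *t)
{
	if (!policy_allows(t->top))
		return 0;
	racer_overmounts(t);		/* the race window */
	return lock_and_resolve(t);	/* may no longer be what was checked */
}

/* Lock first, check second. Returns the overmounted dentry, 0 if denied. */
static int mount_lock_then_check(struct target_sketch *t)
{
	int locked;

	racer_overmounts(t);
	locked = lock_and_resolve(t);
	if (!policy_allows(locked))
		return 0;
	return locked;
}
```

In the first ordering the policy approves dentry_a but the mount lands on dentry_b; in the second the check sees the object that is actually locked. The sketch does not model the deadlock concern raised in the thread about calling hooks while the lock is held.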

* Re: [PATCH] selftests/landlock: explicitly disable audit
From: Maximilian Heyne @ 2026-06-19  9:09 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: stable, Günther Noack, Shuah Khan, linux-security-module,
	linux-kselftest, linux-kernel
In-Reply-To: <20260619.Ang7AiGeishu@digikod.net>

Hi Mickaël,

On Fri, Jun 19, 2026 at 10:32:45AM +0200, Mickaël Salaün wrote:
> I extended your patch and merged it:
> https://git.kernel.org/mic/c/next&id=0302cd72fe196aee933e3fb76f6d175d1ab0e843
> 
> Thanks!

Thank you! Sorry for the late response. Only yesterday I tried the
patches you pointed me at and they also helped in my setup. I was also
about to send a patch regarding filtering out the domain deallocation
records but that was also covered by you already.

> 
> On Tue, Jun 09, 2026 at 12:51:03AM +0200, Mickaël Salaün wrote:
> > Thanks for this patch.  I merged a few fixes and I'd be interested to
> > know if this one fixes the issue you spotted:
> > https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/commit/?h=next&id=d8dfb4c7faa87c3e41a8678f38f136c2c7c036fa
> > 
> > 
> > On Fri, May 29, 2026 at 08:03:41PM +0000, Maximilian Heyne wrote:
> > > I'm seeing sporadic selftest failures, such as
> > > 
> > >   #  RUN           scoped_audit.connect_to_child ...
> > >   # scoped_abstract_unix_test.c:314:connect_to_child:Expected 0 (0) == records.access (8)
> > >   # connect_to_child: Test failed
> > >   #          FAIL  scoped_audit.connect_to_child
> > >   not ok 19 scoped_audit.connect_to_child
> > > 
> > > This seems similar to what commit 3647a4977fb73d ("selftests/landlock:
> > > Drain stale audit records on init") tried to fix. However, the added
> > > drain loop is not effective. When setting the AUDIT_STATUS_PID, the
> > > kauditd_thread is woken up starting to send messages from the hold queue
> > > to the netlink. Depending on scheduling of this kthread not all messages
> > > might be sent via the netlink in the 1 us interval.
> > > 
> > > Therefore, instead of trying to drain the queue, let's just disable
> > > audit when running non-audit tests or more precisely disable it after
> > > audit-tests. This way we won't generate any new audit message that could
> > > interfere with the other tests.
> > > 
> > > The comment saying that on process exit audit will be disabled is wrong.
> > > The closed file descriptor just causes an auditd_reset(), not a
> > > disablement. So future messages will be queued in the hold queue.
> > > 
> > > Cc: stable@vger.kernel.org
> > > Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
> > > Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
> > > ---
> > > 
> > > I've seen the failures on the 6.18 kernels but haven't tested on latest
> > > upstream. However, I still think this is an issue.
> > > 
> > > ---
> > >  tools/testing/selftests/landlock/audit.h | 13 +++++--------
> > >  1 file changed, 5 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
> > > index 834005b2b0f09..7842330875f53 100644
> > > --- a/tools/testing/selftests/landlock/audit.h
> > > +++ b/tools/testing/selftests/landlock/audit.h
> > > @@ -494,10 +494,9 @@ static int audit_init_filter_exe(struct audit_filter *filter, const char *path)
> > >  static int audit_cleanup(int audit_fd, struct audit_filter *filter)
> > 
> > audit_cleanup() should be called for audit_exec tests too.
> > 
> > >  {
> > >  	struct audit_filter new_filter;
> > > +	int err;
> > >  
> > >  	if (audit_fd < 0 || !filter) {
> > > -		int err;
> > > -
> > >  		/*
> > >  		 * Simulates audit_init_with_exe_filter() when called from
> > >  		 * FIXTURE_TEARDOWN_PARENT().
> > > @@ -518,12 +517,10 @@ static int audit_cleanup(int audit_fd, struct audit_filter *filter)
> > >  	audit_filter_exe(audit_fd, filter, AUDIT_DEL_RULE);
> > >  	audit_filter_drop(audit_fd, AUDIT_DEL_RULE);
> > >  
> > > -	/*
> > > -	 * Because audit_cleanup() might not be called by the test auditd
> > > -	 * process, it might not be possible to explicitly set it.  Anyway,
> > > -	 * AUDIT_STATUS_ENABLED will implicitly be set to 0 when the auditd
> > > -	 * process will exit.
> > > -	 */
> > 
> > Please add a comment that explains that the audit state is not restored
> > but just disabled.
> > 
> > > +	err = audit_set_status(audit_fd, AUDIT_STATUS_ENABLED, 0);
> > > +	if (err)
> > > +		return err;
> > > +
> > >  	return close(audit_fd);
> > 
> > FDs should always be closed.
> > 
> > >  }
> > >  
> > > -- 
> > > 2.50.1
> > > 
> > > 
> > > 
> > > 
> > > Amazon Web Services Development Center Germany GmbH
> > > Tamara-Danz-Str. 13
> > > 10243 Berlin
> > > Geschaeftsfuehrung: Christof Hellmis, Andreas Stieger
> > > Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
> > > Sitz: Berlin
> > > Ust-ID: DE 365 538 597
> > > 
> > > 





^ permalink raw reply

* [GIT PULL] Landlock update for v7.2-rc1
From: Mickaël Salaün @ 2026-06-19  8:35 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mickaël Salaün, Bryam Vargas, Günther Noack,
	Günther Noack, Justin Suess, Matthieu Buffet,
	Maximilian Heyne, Tingmao Wang, linux-kernel,
	linux-security-module

Hi,

This PR adds new Landlock access rights to control UDP bind and
connect/send operations, and a new "quiet" feature to mute specific
audit logs (and other future observability events).  A few commits also
fix Landlock issues.

Please pull these changes for v7.2-rc1.  These commits merge cleanly
with your master branch.  Most kernel changes have been tested in the
latest linux-next releases for some weeks, and I waited a bit more since
last week to make sure the changes brought by the recently squashed
fixes are ok.

Test coverage for security/landlock is 91.5% of 2351 lines according to
LLVM 22, and it was 90.9% of 2176 lines before this PR.

syzkaller changes have been developed to cover these new features:
https://github.com/google/syzkaller/pull/7493

Regards,
 Mickaël

--
The following changes since commit 5d6919055dec134de3c40167a490f33c74c12581:

  Linux 7.1-rc3 (2026-05-10 14:08:09 -0700)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git tags/landlock-7.2-rc1

for you to fetch changes up to 1c236e7fe740a009ad8dd40a5ee0602ec402fffe:

  selftests/landlock: Add tests for invalid use of quiet flag (2026-06-14 20:17:25 +0200)

----------------------------------------------------------------
Landlock update for v7.2-rc1

----------------------------------------------------------------
Bryam Vargas (2):
      landlock: Fix LANDLOCK_SCOPE_SIGNAL bypass on the SIGIO path
      selftests/landlock: Test SCOPE_SIGNAL on the SIGIO/fowner pgid path

Matthieu Buffet (7):
      landlock: Fix unmarked concurrent access to socket family
      landlock: Add UDP bind() access control
      landlock: Add UDP send+connect access control
      selftests/landlock: Add tests for UDP bind/connect
      selftests/landlock: Add tests for UDP send
      samples/landlock: Add sandboxer UDP access control
      landlock: Add documentation for UDP support

Maximilian Heyne (1):
      selftests/landlock: Explicitly disable audit in teardowns

Mickaël Salaün (5):
      selftests/landlock: Filter dealloc records in audit_count_records()
      selftests/landlock: Increase default audit socket timeout
      landlock: Set audit_net.sk for socket access checks
      landlock: Account all audit data allocations to user space
      landlock: Demonstrate best-effort allowed_access filtering

Tingmao Wang (9):
      landlock: Add a place for flags to layer rules
      landlock: Add API support and docs for the quiet flags
      landlock: Suppress logging when quiet flag is present
      samples/landlock: Add quiet flag support to sandboxer
      selftests/landlock: Replace hard-coded 16 with a constant
      selftests/landlock: Add tests for quiet flag with fs rules
      selftests/landlock: Add tests for quiet flag with net rules
      selftests/landlock: Add tests for quiet flag with scope
      selftests/landlock: Add tests for invalid use of quiet flag

 Documentation/admin-guide/LSM/landlock.rst         |   13 +-
 Documentation/userspace-api/landlock.rst           |  145 +-
 include/uapi/linux/landlock.h                      |   97 +-
 samples/landlock/sandboxer.c                       |  175 +-
 security/landlock/access.h                         |   44 +-
 security/landlock/audit.c                          |  292 ++-
 security/landlock/audit.h                          |    3 +-
 security/landlock/domain.c                         |   66 +-
 security/landlock/domain.h                         |   16 +-
 security/landlock/fs.c                             |  171 +-
 security/landlock/fs.h                             |   29 +-
 security/landlock/limits.h                         |    5 +-
 security/landlock/net.c                            |  185 +-
 security/landlock/net.h                            |    5 +-
 security/landlock/ruleset.c                        |   49 +-
 security/landlock/ruleset.h                        |   29 +-
 security/landlock/syscalls.c                       |   73 +-
 security/landlock/task.c                           |   11 +
 tools/testing/selftests/landlock/audit.h           |  140 +-
 tools/testing/selftests/landlock/audit_test.c      |   33 +-
 tools/testing/selftests/landlock/base_test.c       |  122 +-
 tools/testing/selftests/landlock/common.h          |    2 +
 tools/testing/selftests/landlock/fs_test.c         | 2445 +++++++++++++++++++-
 tools/testing/selftests/landlock/net_test.c        | 1392 ++++++++++-
 tools/testing/selftests/landlock/ptrace_test.c     |    1 +
 .../selftests/landlock/scoped_abstract_unix_test.c |   78 +-
 .../selftests/landlock/scoped_signal_test.c        |  182 ++
 27 files changed, 5368 insertions(+), 435 deletions(-)

^ permalink raw reply

* Re: [PATCH] selftests/landlock: explicitly disable audit
From: Mickaël Salaün @ 2026-06-19  8:32 UTC (permalink / raw)
  To: Maximilian Heyne
  Cc: stable, Günther Noack, Shuah Khan, linux-security-module,
	linux-kselftest, linux-kernel
In-Reply-To: <20260604.Gee4caexei8o@digikod.net>

I extended your patch and merged it:
https://git.kernel.org/mic/c/next&id=0302cd72fe196aee933e3fb76f6d175d1ab0e843

Thanks!

On Tue, Jun 09, 2026 at 12:51:03AM +0200, Mickaël Salaün wrote:
> Thanks for this patch.  I merged a few fixes and I'd be interested to
> know if this one fixes the issue you spotted:
> https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/commit/?h=next&id=d8dfb4c7faa87c3e41a8678f38f136c2c7c036fa
> 
> 
> On Fri, May 29, 2026 at 08:03:41PM +0000, Maximilian Heyne wrote:
> > I'm seeing sporadic selftest failures, such as
> > 
> >   #  RUN           scoped_audit.connect_to_child ...
> >   # scoped_abstract_unix_test.c:314:connect_to_child:Expected 0 (0) == records.access (8)
> >   # connect_to_child: Test failed
> >   #          FAIL  scoped_audit.connect_to_child
> >   not ok 19 scoped_audit.connect_to_child
> > 
> > This seems similar to what commit 3647a4977fb73d ("selftests/landlock:
> > Drain stale audit records on init") tried to fix. However, the added
> > drain loop is not effective. When setting the AUDIT_STATUS_PID, the
> > kauditd_thread is woken up starting to send messages from the hold queue
> > to the netlink. Depending on scheduling of this kthread not all messages
> > might be sent via the netlink in the 1 us interval.
> > 
> > Therefore, instead of trying to drain the queue, let's just disable
> > audit when running non-audit tests or more precisely disable it after
> > audit-tests. This way we won't generate any new audit message that could
> > interfere with the other tests.
> > 
> > The comment saying that on process exit audit will be disabled is wrong.
> > The closed file descriptor just causes an auditd_reset(), not a
> > disablement. So future messages will be queued in the hold queue.
> > 
> > Cc: stable@vger.kernel.org
> > Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
> > Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
> > ---
> > 
> > I've seen the failures on the 6.18 kernels but haven't tested on latest
> > upstream. However, I still think this is an issue.
> > 
> > ---
> >  tools/testing/selftests/landlock/audit.h | 13 +++++--------
> >  1 file changed, 5 insertions(+), 8 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
> > index 834005b2b0f09..7842330875f53 100644
> > --- a/tools/testing/selftests/landlock/audit.h
> > +++ b/tools/testing/selftests/landlock/audit.h
> > @@ -494,10 +494,9 @@ static int audit_init_filter_exe(struct audit_filter *filter, const char *path)
> >  static int audit_cleanup(int audit_fd, struct audit_filter *filter)
> 
> audit_cleanup() should be called for audit_exec tests too.
> 
> >  {
> >  	struct audit_filter new_filter;
> > +	int err;
> >  
> >  	if (audit_fd < 0 || !filter) {
> > -		int err;
> > -
> >  		/*
> >  		 * Simulates audit_init_with_exe_filter() when called from
> >  		 * FIXTURE_TEARDOWN_PARENT().
> > @@ -518,12 +517,10 @@ static int audit_cleanup(int audit_fd, struct audit_filter *filter)
> >  	audit_filter_exe(audit_fd, filter, AUDIT_DEL_RULE);
> >  	audit_filter_drop(audit_fd, AUDIT_DEL_RULE);
> >  
> > -	/*
> > -	 * Because audit_cleanup() might not be called by the test auditd
> > -	 * process, it might not be possible to explicitly set it.  Anyway,
> > -	 * AUDIT_STATUS_ENABLED will implicitly be set to 0 when the auditd
> > -	 * process will exit.
> > -	 */
> 
> Please add a comment that explains that the audit state is not restored
> but just disabled.
> 
> > +	err = audit_set_status(audit_fd, AUDIT_STATUS_ENABLED, 0);
> > +	if (err)
> > +		return err;
> > +
> >  	return close(audit_fd);
> 
> FDs should always be closed.
> 
> >  }
> >  
> > -- 
> > 2.50.1
> > 
> > 
> > 
> > 
> > Amazon Web Services Development Center Germany GmbH
> > Tamara-Danz-Str. 13
> > 10243 Berlin
> > Geschaeftsfuehrung: Christof Hellmis, Andreas Stieger
> > Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
> > Sitz: Berlin
> > Ust-ID: DE 365 538 597
> > 
> > 

^ permalink raw reply
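
The review comment "FDs should always be closed" refers to the early return in the quoted hunk, which skips close() when disabling audit fails. A minimal sketch of a cleanup that reports the first error but still closes the descriptor could look like the following. It is an illustration only: `cleanup_always_close()`, `disable_fn`, the two stub callbacks and `fd_is_closed()` are hypothetical names, and the callback stands in for the selftest's status-setting helper.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for the step that disables audit on @fd. */
typedef int (*disable_fn)(int fd);

/* Stub that models a failing disable step. */
static int disable_fails(int fd)
{
	(void)fd;
	return -EPERM;
}

/* Stub that models a successful disable step. */
static int disable_ok(int fd)
{
	(void)fd;
	return 0;
}

/* Run the disable step, then close @fd whether or not it succeeded. */
static int cleanup_always_close(int fd, disable_fn disable)
{
	int err, close_err;

	err = disable(fd);
	close_err = close(fd) ? -errno : 0;
	/* Report the first failure, but never leak the descriptor. */
	return err ? err : close_err;
}

/* Helper for callers that want to verify the descriptor is gone. */
static int fd_is_closed(int fd)
{
	return fcntl(fd, F_GETFD) == -1 && errno == EBADF;
}
```

The design choice is to keep the error from the disable step as the return value, since that is the failure the caller most likely needs to see, while the close() result is only reported when nothing failed earlier.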

* [PATCH] landlock: work around gcc-16 -Wuninitialized warning
From: Arnd Bergmann @ 2026-06-19  8:21 UTC (permalink / raw)
  To: Mickaël Salaün, Paul Moore, James Morris,
	Serge E. Hallyn, Tingmao Wang, Justin Suess
  Cc: Arnd Bergmann, Günther Noack, linux-security-module,
	linux-kernel

From: Arnd Bergmann <arnd@arndb.de>

gcc has a bug with -ftrivial-auto-var-init=pattern that produces a
warning for correct code that uses sparse bitfields:

security/landlock/fs.c: In function 'is_access_to_paths_allowed.isra':
security/landlock/fs.c:767:28: error: '_layer_masks_child1' is used uninitialized [-Werror=uninitialized]
  767 |         struct layer_masks _layer_masks_child1, _layer_masks_child2;
      |                            ^~~~~~~~~~~~~~~~~~~
security/landlock/fs.c:767:28: note: '_layer_masks_child1' declared here
  767 |         struct layer_masks _layer_masks_child1, _layer_masks_child2;
      |                            ^~~~~~~~~~~~~~~~~~~
security/landlock/fs.c: In function 'hook_unix_find':
security/landlock/fs.c:1649:28: error: 'layer_masks' is used uninitialized [-Werror=uninitialized]
 1649 |         struct layer_masks layer_masks;
      |                            ^~~~~~~~~~~
security/landlock/fs.c:1649:28: note: 'layer_masks' declared here
 1649 |         struct layer_masks layer_masks;
      |                            ^~~~~~~~~~~

To work around this, change the definition of struct layer_mask to
use an explicit padding field. This also avoids the extra attributes
for aligning the structure.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110743
Fixes: a260c0055665 ("landlock: Add a place for flags to layer rules")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 security/landlock/access.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/security/landlock/access.h b/security/landlock/access.h
index d926078bf0a5..89ab9fbcebe4 100644
--- a/security/landlock/access.h
+++ b/security/landlock/access.h
@@ -81,7 +81,10 @@ struct layer_mask {
 	 */
 	access_mask_t quiet : 1;
 #endif /* CONFIG_AUDIT */
-} __packed __aligned(sizeof(access_mask_t));
+	access_mask_t __pad : ((sizeof(access_mask_t) * 8) -
+				LANDLOCK_NUM_ACCESS_MAX -
+				IS_ENABLED(CONFIG_AUDIT));
+};
 
 /*
  * Make sure that we don't increase the size of struct layer_mask when storing
-- 
2.39.5


^ permalink raw reply related
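
The idea behind the workaround can be shown with a standalone structure. This is a minimal sketch under stated assumptions, not the Landlock definition: `mask_sketch_t`, `NUM_ACCESS_SKETCH`, `struct layer_mask_sketch`, the `allowed` field and `make_mask()` are hypothetical, the audit bit is assumed to be always present, and bit-fields on a fixed-width type rely on GCC/Clang behaviour.

```c
#include <stdint.h>

typedef uint32_t mask_sketch_t;	/* assumption: stands in for access_mask_t */
#define NUM_ACCESS_SKETCH 16	/* assumption: stands in for LANDLOCK_NUM_ACCESS_MAX */

struct layer_mask_sketch {
	mask_sketch_t allowed : NUM_ACCESS_SKETCH;
	mask_sketch_t quiet : 1;
	/*
	 * Name the otherwise implicit padding so that every bit of the
	 * storage unit belongs to a declared field.
	 */
	mask_sketch_t pad : (sizeof(mask_sketch_t) * 8) -
			    NUM_ACCESS_SKETCH - 1;
};

/* The explicit padding keeps the size at one mask without attributes. */
_Static_assert(sizeof(struct layer_mask_sketch) == sizeof(mask_sketch_t),
	       "layer mask must fit in one access mask");

/* Build a mask with all bits, including the padding, initialized. */
static struct layer_mask_sketch make_mask(unsigned int allowed,
					  unsigned int quiet)
{
	struct layer_mask_sketch m = { 0 };

	m.allowed = allowed;
	m.quiet = quiet;
	return m;
}
```

Because the three widths add up to the full width of the underlying type, the compiler has no unnamed bits left to reason about, and the size check holds without packing or alignment attributes.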

* Re: [PATCH v5 7/8] vfs: Replace security_sb_mount/security_move_mount with granular hooks
From: Song Liu @ 2026-06-19  6:14 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-security-module, linux-fsdevel, selinux, apparmor, paul,
	jmorris, serge, viro, jack, john.johansen, stephen.smalley.work,
	omosnace, mic, gnoack, takedakn, penguin-kernel, herton,
	kernel-team
In-Reply-To: <178179134209.111814.12159888808546010170.b4-reply@b4>

On Thu, Jun 18, 2026 at 10:04 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On 2026-06-18 18:56:42+08:00, Song Liu wrote:
> > On Wed, Jun 17, 2026 at 9:53 PM Christian Brauner <brauner@kernel.org> wrote:
> >
> > > On Thu, May 28, 2026 at 11:26:06AM -0700, Song Liu wrote:
> >
> > [...]
> >
> > > >
> > >
> > > This again is racy as it is called outside of the namespace semaphore:
> > >
> > >         err = security_mount_bind(&old_path, path, recurse);
> > >         if (err)
> > >                 return err;
> > >
> > >         if (mnt_ns_loop(old_path.dentry))
> > >                 return -EINVAL;
> > >
> > >         LOCK_MOUNT(mp, path);
> > >         if (IS_ERR(mp.parent))
> > >                 return PTR_ERR(mp.parent);
> > >
> > > After LOCK_MOUNT @path might point to a completely different mount than
> > > the one you performed your security checks on.
> >
> > I thought we agreed at LSF/MM/BPF 2026 to add the LSM hooks
> > before taking namespace semaphore, so that it is possible for LSMs
> > to defend against DoS attacks on namespace semaphore? Did I
> > miss/misunderstand something?
>
> I think there was a misunderstanding. What I pointed out was that it's a
> trade-off. If we do call security hooks under the namespace semaphore or
> mount lock then anything that's called under there must take care to not
> cause deadlocks - which is especially easy to do with mount lock and
> even with the namespace semaphore it may get hairy (automounts etc). The
> DoS thing is another worry but if an LSM does stupid things we tell it
> to not do stupid things and to go away.
>
> But as the hooks are done right now they are meaningless from a security
> perspective. You might have a policy that allows mounting on dentry_a
> and deny mounting on dentry_b: before LOCK_MOUNT*() you may see dentry_a
> and allow the mount but after LOCK_MOUNT*() someone raced you and shoved
> a dentry_b mount onto dentry_b and now you allow overmounting dentry_b
> which your policy didn't allow -> hosed.

So the direction here is to move security_mount_bind() after
LOCK_MOUNT(mp, path)? This should be easy to fix.

> > > Placement of this hook suffers from the same issue as the bind mount
> > > hook. Here it's worse because the security layer isn't even informed
> > > about MOVE_MOUNT_BENEATH which completely alters the mount relationship.
> >
> > Current hook security_move_mount doesn't handle
> > MOVE_MOUNT_BENEATH. But we can add mflags to security_mount_move().
> > Do we need anything other than mflags?
>
> I think you either need to pass three mounts (source, target, top_mnt)
> where for non-mount beneath target == top_mnt or you need two separate
> hooks. Because for MOVE_MOUNT_BENEATH you may want to have a tri-part
> policy: source, target, top_mnt.

One hook with (source, target, top_mnt) seems easier here. But let me take a
closer look at this.

Thanks,
Song

^ permalink raw reply
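
The single-hook option discussed above, where target and top_mnt are the same object except in the beneath case, might be sketched as follows. Everything here is hypothetical and for illustration only: `struct mnt_sketch`, its `trusted` flag, `mount_move_hook_sketch()` and the toy policy are not taken from the series.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mount descriptor with a toy policy attribute. */
struct mnt_sketch {
	int id;
	bool trusted;
};

/*
 * Hypothetical hook shape. For a regular move the caller passes the
 * same object as @target and @top_mnt; only the beneath case passes a
 * distinct @top_mnt, which lets a policy evaluate all three parties.
 */
static int mount_move_hook_sketch(const struct mnt_sketch *source,
				  const struct mnt_sketch *target,
				  const struct mnt_sketch *top_mnt)
{
	if (!source || !target || !top_mnt)
		return -EINVAL;
	if (!source->trusted)
		return -EPERM;
	/* Beneath case: the mount that stays on top is checked too. */
	if (top_mnt != target && !top_mnt->trusted)
		return -EPERM;
	return 0;
}
```

Compared with two separate hooks, this keeps one entry point and lets an LSM that does not care about the beneath case ignore the third argument.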

* Re: [RFC PATCH 1/2] landlock: fix TCP Fast Open connection bypass
From: Bryam Vargas @ 2026-06-19  1:39 UTC (permalink / raw)
  To: Matthieu Buffet
  Cc: Mickaël Salaün, Günther Noack, Mikhail Ivanov,
	Paul Moore, Eric Dumazet, Neal Cardwell, linux-security-module,
	netdev, linux-kernel
In-Reply-To: <e0c7b502-931e-481e-89b0-b47687d2b942@buffet.re>

Thanks, that settles it: MPTCP is out of scope by design, not a gap.

I read 854277e2cc8c ("landlock: Fix non-TCP sockets restriction"). It
changed the sock->type != SOCK_STREAM test to !sk_is_tcp(sock->sk),
dropping SMC/MPTCP/SCTP from the TCP rights on purpose, and 3d4033985ff5
pins that with a "MPTCP actions are not restricted" selftest. So my
"|| sk_protocol == IPPROTO_MPTCP" suggestion was wrong: it would revert
that decision and break the selftest. Please disregard it.

That leaves the series complete as-is on this axis. Keeping both the v0
guard and the 2/2 selftest sk_is_tcp()-only is correct, and the
Tested-by stands for the TCP and IPv6 fast-open path the patch fixes.

Bryam


^ permalink raw reply
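
The difference between the two tests mentioned in this message can be shown on plain socket parameters. This is a user-space sketch under stated assumptions: `struct sock_sketch`, `old_check_applies()` and `is_tcp_sketch()` are hypothetical, and `is_tcp_sketch()` is only an approximation of what the kernel's sk_is_tcp() helper checks.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <sys/socket.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262	/* fallback for older libc headers */
#endif

/* Hypothetical description of a socket by its creation parameters. */
struct sock_sketch {
	int family;
	int type;
	int protocol;
};

/* Models the earlier test: any stream socket is treated as covered. */
static bool old_check_applies(const struct sock_sketch *s)
{
	return s->type == SOCK_STREAM;
}

/* Models the narrower test: only inet stream sockets speaking TCP. */
static bool is_tcp_sketch(const struct sock_sketch *s)
{
	return (s->family == AF_INET || s->family == AF_INET6) &&
	       s->type == SOCK_STREAM && s->protocol == IPPROTO_TCP;
}
```

An MPTCP socket is a stream socket, so the earlier test matches it, while the protocol-specific test does not; adding an MPTCP clause back would undo that distinction.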

* AppArmor: TCP Fast Open bypasses connect mediation (last unaddressed LSM)
From: Bryam Vargas @ 2026-06-19  1:11 UTC (permalink / raw)
  To: John Johansen, linux-security-module, apparmor
  Cc: Paul Moore, James Morris, Serge E . Hallyn, Mickael Salaun,
	Stephen Smalley, Matthieu Buffet, Mikhail Ivanov, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

Hello John, and LSM folks,

I have been working on the Landlock TCP Fast Open connect bypass [1]. Stephen
Smalley's SELinux fix for the same issue [3] -- "Similar to Landlock, SELinux was
not updated when TCP Fast Open support was introduced ..." -- made me go back and
check the rest of the connect-mediating LSMs, since I had only been looking at
Landlock. With Landlock [2], SELinux [3], and now TOMOYO [4] all getting fixes,
AppArmor is the last one with the same gap and no fix yet.

Root cause (shared with the others)
-----------------------------------
security_socket_connect() has a single call site, net/socket.c (the connect(2)
syscall). TCP Fast Open performs an implicit connect inside sendmsg:

  tcp_sendmsg -> tcp_sendmsg_fastopen -> __inet_stream_connect(..., is_sendmsg=1)
              -> sk->sk_prot->connect()                 net/ipv4/{tcp.c,af_inet.c}

This never calls security_socket_connect(); the only LSM hook on the path is
security_socket_sendmsg(). mptcp_sendmsg_fastopen reaches the same code and is a
second producer.

AppArmor
--------
apparmor_socket_connect() requests AA_MAY_CONNECT; apparmor_socket_sendmsg() (via
aa_sock_msg_perm) requests AA_MAY_SEND. These are distinct bits, and apparmor_parser
compiles them independently: "network send inet stream," yields accept mask 0x02
while "network connect inet stream," yields 0x40. So an egress-restriction profile
that grants send but not connect is bypassed by MSG_FASTOPEN.
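
The asymmetry can be modeled with a few lines of userspace C. This is a sketch, not AppArmor code: the MODEL_ names and helper functions are made up for illustration, and only the two mask values (0x02 and 0x40) come from the text above.

```c
#include <stdbool.h>

/* Accept-mask values quoted above; the MODEL_ names are hypothetical. */
#define MODEL_MAY_SEND    0x02u
#define MODEL_MAY_CONNECT 0x40u

/* A request is allowed only if every requested bit is granted. */
static bool model_allowed(unsigned int granted, unsigned int requested)
{
	return (granted & requested) == requested;
}

/* connect(2) path: the connect hook asks for the connect bit. */
static bool model_connect_syscall(unsigned int granted)
{
	return model_allowed(granted, MODEL_MAY_CONNECT);
}

/* sendmsg(MSG_FASTOPEN) path: only the send hook runs, asking for send. */
static bool model_fastopen_send(unsigned int granted)
{
	return model_allowed(granted, MODEL_MAY_SEND);
}
```

For a profile whose granted mask is just the send bit, model_connect_syscall() denies while model_fastopen_send() allows, even though both paths end up establishing a connection.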

Reproduced on 6.12.88 with AppArmor active. Under a profile granting the inet/inet6
stream lifecycle except connect:

  aa-exec -p egress_restricted -- ./probe
  [TCP ] connect(2)=EACCES(blocked)  sendto(MSG_FASTOPEN)=OK(reached)  => connection established
  [TCP6] connect(2)=EACCES(blocked)  sendto(MSG_FASTOPEN)=OK(reached)  => connection established

(The coarse "network inet stream," idiom grants connect anyway, so this only bites the
fine-grained "allow send, deny connect" policy that the asymmetry is meant to serve.)

Fix
---
Same shape as the TOMOYO [4] and SELinux [3] fixes: in apparmor_socket_sendmsg (or
aa_sock_msg_perm), when MSG_FASTOPEN is set and msg_name carries a destination on a
not-yet-connected stream socket, additionally require aa_sk_perm(OP_CONNECT,
AA_MAY_CONNECT, sk). I am happy to send that patch and the reproducer.
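
The condition described above can be written down as a small predicate. This is a userspace sketch under stated assumptions: struct model_send and model_needs_connect_check() are hypothetical names, and the real hook would read these values from the socket and msghdr rather than from a summary struct.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

#ifndef MSG_FASTOPEN
#define MSG_FASTOPEN 0x20000000 /* value from the Linux UAPI headers */
#endif

/* Hypothetical summary of the state a send hook would inspect. */
struct model_send {
	int flags;        /* msg_flags */
	const void *name; /* msg_name, NULL if no destination was given */
	int type;         /* socket type */
	bool connected;   /* is the socket already connected? */
};

/*
 * True when the send implies a connect: MSG_FASTOPEN is set, a
 * destination is present, and the stream socket is not yet connected.
 * In that case the connect permission would be required in addition
 * to the send permission.
 */
static bool model_needs_connect_check(const struct model_send *s)
{
	return (s->flags & MSG_FASTOPEN) != 0 && s->name != NULL &&
	       s->type == SOCK_STREAM && !s->connected;
}
```

Sends on an already-connected socket, or without MSG_FASTOPEN, fall through to the existing send-only check.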

(A single core check in __inet_stream_connect(), gated on is_sendmsg, would have
covered all five LSMs and both the TCP and MPTCP producers in one place -- the kernel
already mediates the analogous implicit-connect-on-send for AF_UNIX via
security_unix_may_send and for SCTP via security_sctp_bind_connect. But since the
other four LSMs are taking per-hook fixes, AppArmor matching them is the consistent
move; mentioning the core option only in case it is preferred.)

[1] Landlock: LANDLOCK_ACCESS_NET_CONNECT_TCP bypass via TCP Fast Open (report)
    https://lore.kernel.org/r/20260616201615.275032-1-hexlabsecurity@proton.me
[2] landlock: fix TCP Fast Open connection bypass (Matthieu Buffet)
    https://lore.kernel.org/r/20260617180526.15627-2-matthieu@buffet.re
[3] selinux: check connect-related permissions on TCP Fast Open (Stephen Smalley)
    https://lore.kernel.org/r/20260618175513.112443-2-stephen.smalley.work@gmail.com
[4] tomoyo: Enforce connect policy in TCP Fast Open (Matthieu Buffet)
    https://lore.kernel.org/r/20260619002207.61104-1-matthieu@buffet.re

Thanks,
Bryam Vargas



* Re: [RFC PATCH 1/2] landlock: fix TCP Fast Open connection bypass
From: Matthieu Buffet @ 2026-06-19  0:34 UTC
  To: Bryam Vargas
  Cc: Mickaël Salaün, Günther Noack, Mikhail Ivanov,
	Paul Moore, Eric Dumazet, Neal Cardwell, linux-security-module,
	netdev, linux-kernel
In-Reply-To: <20260618012527.34964-1-hexlabsecurity@proton.me>

Hi Bryam,

On 6/18/2026 3:25 AM, Bryam Vargas wrote:
> One scope note, since you mention MPTCP: an MPTCP socket isn't covered.
> sk_is_tcp() is false for the mptcp parent (sk_protocol is IPPROTO_MPTCP), so
> neither the new sendmsg hook nor the existing socket_connect one mediates it. On
> the patched kernel my MPTCP arm still reaches the blocked port via both connect()
> and MSG_FASTOPEN. If MPTCP is meant to be in scope for CONNECT_TCP, the guard
> wants `|| sk->sk_protocol == IPPROTO_MPTCP` (not sk_is_mptcp(), which is the
> subflow flag).

Indeed, the patch does not try to filter MPTCP: it is not meant to be in 
the scope of LANDLOCK_ACCESS_NET_*_TCP rights.
It used to be, but it was a bug, see:
https://lore.kernel.org/all/20250205093651.1424339-2-ivanov.mikhail1@huawei-partners.com/

Have a nice day!

-- 
Matthieu


* [PATCH] tomoyo: Enforce connect policy in TCP Fast Open
From: Matthieu Buffet @ 2026-06-19  0:22 UTC
  To: Kentaro Takeda, Tetsuo Handa
  Cc: Bryam Vargas, Mickaël Salaün, Günther Noack,
	linux-security-module, Mikhail Ivanov, Paul Moore, Yuchung Cheng,
	Eric Dumazet, netdev, Matthieu Buffet

Tomoyo restricted TCP connections in 2011 in commit
059d84dbb389 ("TOMOYO: Add socket operation restriction support.")
using the socket_connect() LSM hook.

However, the MSG_FASTOPEN sendmsg() flag was added in 2012 to allow
combining connect() and the first sendmsg(). Tomoyo was not updated to
take this into account in its send hook.

This resulted in a TCP connect policy bypass similar to that reported in
Landlock in 2024 (see Link below), with the difference that Tomoyo was
fine when originally merged, and the problem got introduced when adding
fastopen support, possibly due to a lack of synchronization between the
LSM and netdev worlds.

Add MSG_FASTOPEN handling in Tomoyo's existing send hook.

Link: https://github.com/landlock-lsm/linux/issues/41
Link: https://lore.kernel.org/all/20260616201615.275032-1-hexlabsecurity@proton.me/
Fixes: cf60af03ca4e ("net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN)")
Cc: stable@kernel.org
Signed-off-by: Matthieu Buffet <matthieu@buffet.re>
---
 security/tomoyo/network.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/security/tomoyo/network.c b/security/tomoyo/network.c
index cfc2a019de1e..7d9ba7268dc2 100644
--- a/security/tomoyo/network.c
+++ b/security/tomoyo/network.c
@@ -764,11 +764,25 @@ int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg,
 	struct tomoyo_addr_info address;
 	const u8 family = tomoyo_sock_family(sock->sk);
 	const unsigned int type = sock->type;
+	int ret;
 
+	address.protocol = type;
+
+	if ((msg->msg_flags & MSG_FASTOPEN) != 0 && msg->msg_name != NULL &&
+	    (sk_is_tcp(sock->sk) ||
+	     (sk_is_inet(sock->sk) && type == SOCK_STREAM &&
+	      sock->sk->sk_protocol == IPPROTO_MPTCP))) {
+		address.operation = TOMOYO_NETWORK_CONNECT;
+		ret = tomoyo_check_inet_address(
+			(struct sockaddr *)msg->msg_name, msg->msg_namelen,
+			sock->sk->sk_protocol, &address);
+		if (ret != 0)
+			return ret;
+	}
 	if (!msg->msg_name || !family ||
 	    (type != SOCK_DGRAM && type != SOCK_RAW))
 		return 0;
-	address.protocol = type;
+
 	address.operation = TOMOYO_NETWORK_SEND;
 	if (family == PF_UNIX)
 		return tomoyo_check_unix_address((struct sockaddr *)
-- 
2.47.3



* Re: [PATCH bpf-next v3 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: David Windsor @ 2026-06-18 22:50 UTC
  To: bot+bpf-ci
  Cc: viro, brauner, jack, ast, daniel, john.fastabend, andrii, eddyz87,
	memxor, martin.lau, song, yonghong.song, jolsa, emil, kpsingh,
	mattbobrowski, paul, jmorris, serge, zohar, roberto.sassu,
	dmitry.kasatkin, eric.snowberg, stephen.smalley.work, omosnace,
	casey, shuah, linux-kernel, linux-fsdevel, bpf,
	linux-security-module, linux-integrity, selinux, linux-kselftest,
	martin.lau, clm, ihor.solodrai
In-Reply-To: <e9ec622a8a49be299cbbfd459e2fc10693bcd65c6dbf00c9e2645281564e5579@mail.kernel.org>

On Thu, Jun 18, 2026 at 5:22 PM <bot+bpf-ci@kernel.org> wrote:
>
> [ ... ]
> >
> > +struct xattr_ctx {
> > +     struct xattr *xattrs;
> > +     int *xattr_count;
> > +};
> > +
>
> Paul Moore suggested in v2 renaming this to 'struct lsm_xattrs' to avoid
> overloading the term 'ctx' in the LSM space. The struct was renamed from
> 'lsm_xattr_ctx' to 'xattr_ctx' in v3, but 'ctx' is still in the name. Was
> there a reason to keep 'ctx' in the name, or would renaming to 'lsm_xattrs'
> as Paul suggested be preferable?
>

Indeed he did; I mistakenly inverted his original request.

Will send a fix for this, as well as converting GFP_KERNEL to GFP_NOFS
in the allocation in __bpf_init_inode_xattr.


> >  /* Default (no) options for the capable function */
> >  #define CAP_OPT_NONE 0x0
> >  /* If capable should audit the security request */
>
> [ ... ]
>
> > diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> > index 564071a92d7d..86a8e188b900 100644
> > --- a/kernel/bpf/bpf_lsm.c
> > +++ b/kernel/bpf/bpf_lsm.c
> > @@ -113,6 +113,9 @@ void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> >  }
> >  #endif
> >
> > +BTF_ID_LIST_SINGLE(bpf_lsm_inode_init_security_btf_ids, func,
> > +                bpf_lsm_inode_init_security)
> > +
> >  int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
> >                       const struct bpf_prog *prog)
> >  {
> > @@ -137,6 +140,12 @@ int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
> >               return -EINVAL;
> >       }
> >
> > +     /* bpf reserves a fixed number of xattr slots for itself.
> > +      * Set the attach limit so the trampoline rejects excess attaches.
> > +      */
>
> This isn't a bug, but a subsystem pattern flags this as potentially
> concerning: the BPF subsystem comment-style rule for files under kernel/bpf/
> requires multi-line comments to have the opening /* on its own line, with
> comment text starting on the next line. The preferred form would be:
>
>         /*
>          * bpf reserves a fixed number of xattr slots for itself.
>          * Set the attach limit so the trampoline rejects excess attaches.
>          */
>
> > +     if (btf_id == bpf_lsm_inode_init_security_btf_ids[0])
> > +             prog->aux->attach_limit = BPF_LSM_INODE_INIT_XATTRS;
> > +
> >       return 0;
> >  }
> >
>
> [ ... ]
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/27788616397


* Re: [PATCH] KEYS: avoid filesystem reclaim while holding keyring->sem
From: Eric Biggers @ 2026-06-18 22:32 UTC
  To: Mohammed EL Kadiri
  Cc: dhowells, jarkko, paul, jmorris, serge, keyrings,
	linux-security-module, linux-kernel, stable, syzkaller-bugs,
	syzbot+f55b043dacf43776b50c
In-Reply-To: <20260614150041.21172-1-med08elkadiri@gmail.com>

On Sun, Jun 14, 2026 at 04:00:41PM +0100, Mohammed EL Kadiri wrote:
> __key_link_begin() runs with keyring->sem held and calls
> assoc_array_insert(), which does GFP_KERNEL allocations.  Those
> allocations may enter filesystem reclaim, evict an fscrypt-protected
> inode, and reach keyring_clear() via fscrypt_put_master_key() --
> taking a keyring semaphore of the same lockdep class and closing a
> keyring->sem -> fs_reclaim -> keyring->sem cycle reported by syzbot.
> 
> Wrap the assoc_array_insert() call with memalloc_nofs_save() /
> memalloc_nofs_restore() so reclaim cannot recurse into the keys
> subsystem while keyring->sem is held.
> 
> Reported-by: syzbot+f55b043dacf43776b50c@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=f55b043dacf43776b50c
> Fixes: d7e7b9af104c ("fscrypt: stop using keyrings subsystem for fscrypt_master_key")
> Cc: stable@vger.kernel.org
> Signed-off-by: Mohammed EL Kadiri <med08elkadiri@gmail.com>

My patch "fscrypt: Replace mk_users keyring with simple list"
(https://lore.kernel.org/linux-fscrypt/20260618221921.87896-1-ebiggers@kernel.org/)
fixes this lockdep false positive by making fscrypt no longer use
'struct key' keyrings to keep track of user claims to fscrypt master
keys.  That eliminates the need to clear such keyrings during filesystem
reclaim.

So this patch to security/keys/keyring.c isn't needed, unless there's
another reason for it.

- Eric


* Re: [PATCH bpf-next v3 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: bot+bpf-ci @ 2026-06-18 21:22 UTC
  To: dwindsor, viro, brauner, jack, ast, daniel, john.fastabend,
	andrii, eddyz87, memxor, martin.lau, song, yonghong.song, jolsa,
	emil, kpsingh, mattbobrowski, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg,
	stephen.smalley.work, omosnace, casey, shuah
  Cc: linux-kernel, linux-fsdevel, bpf, linux-security-module,
	linux-integrity, selinux, linux-kselftest, dwindsor, ast, andrii,
	daniel, martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai
In-Reply-To: <20260618203411.73917-2-dwindsor@gmail.com>


> diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c
> index 768aca2dc0f0..7abc3f3d1a67 100644
> --- a/fs/bpf_fs_kfuncs.c
> +++ b/fs/bpf_fs_kfuncs.c

[ ... ]

> diff --git a/include/linux/security.h b/include/linux/security.h
> index 153e9043058f..1f8e84e7dd7e 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -68,6 +68,11 @@ struct watch;
>  struct watch_notification;
>  struct lsm_ctx;
>
> +struct xattr_ctx {
> +	struct xattr *xattrs;
> +	int *xattr_count;
> +};
> +

Paul Moore suggested in v2 renaming this to 'struct lsm_xattrs' to avoid
overloading the term 'ctx' in the LSM space. The struct was renamed from
'lsm_xattr_ctx' to 'xattr_ctx' in v3, but 'ctx' is still in the name. Was
there a reason to keep 'ctx' in the name, or would renaming to 'lsm_xattrs'
as Paul suggested be preferable?

>  /* Default (no) options for the capable function */
>  #define CAP_OPT_NONE 0x0
>  /* If capable should audit the security request */

[ ... ]

> diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> index 564071a92d7d..86a8e188b900 100644
> --- a/kernel/bpf/bpf_lsm.c
> +++ b/kernel/bpf/bpf_lsm.c
> @@ -113,6 +113,9 @@ void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
>  }
>  #endif
>
> +BTF_ID_LIST_SINGLE(bpf_lsm_inode_init_security_btf_ids, func,
> +		   bpf_lsm_inode_init_security)
> +
>  int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
>  			const struct bpf_prog *prog)
>  {
> @@ -137,6 +140,12 @@ int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
>  		return -EINVAL;
>  	}
>
> +	/* bpf reserves a fixed number of xattr slots for itself.
> +	 * Set the attach limit so the trampoline rejects excess attaches.
> +	 */

This isn't a bug, but a subsystem pattern flags this as potentially
concerning: the BPF subsystem comment-style rule for files under kernel/bpf/
requires multi-line comments to have the opening /* on its own line, with
comment text starting on the next line. The preferred form would be:

	/*
	 * bpf reserves a fixed number of xattr slots for itself.
	 * Set the attach limit so the trampoline rejects excess attaches.
	 */

> +	if (btf_id == bpf_lsm_inode_init_security_btf_ids[0])
> +		prog->aux->attach_limit = BPF_LSM_INODE_INIT_XATTRS;
> +
>  	return 0;
>  }
>

[ ... ]


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/27788616397


* [PATCH bpf-next v3 2/2] selftests/bpf: add tests for bpf_init_inode_xattr kfunc
From: David Windsor @ 2026-06-18 20:34 UTC
  To: viro, brauner, jack, ast, daniel, john.fastabend, andrii, eddyz87,
	memxor, martin.lau, song, yonghong.song, jolsa, emil, kpsingh,
	mattbobrowski, paul, jmorris, serge, zohar, roberto.sassu,
	dmitry.kasatkin, eric.snowberg, stephen.smalley.work, omosnace,
	casey, shuah
  Cc: linux-kernel, linux-fsdevel, bpf, linux-security-module,
	linux-integrity, selinux, linux-kselftest, David Windsor
In-Reply-To: <20260618203411.73917-1-dwindsor@gmail.com>

Test bpf atomic inode xattr labeling in inode_init_security.

Signed-off-by: David Windsor <dwindsor@gmail.com>
---
 tools/testing/selftests/bpf/bpf_kfuncs.h      |   5 +
 .../selftests/bpf/prog_tests/fs_kfuncs.c      | 105 +++++++++++++++++-
 .../bpf/progs/test_init_inode_xattr.c         |  31 ++++++
 3 files changed, 140 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_init_inode_xattr.c

diff --git a/tools/testing/selftests/bpf/bpf_kfuncs.h b/tools/testing/selftests/bpf/bpf_kfuncs.h
index ae71e9b69051..69d3641ee2d8 100644
--- a/tools/testing/selftests/bpf/bpf_kfuncs.h
+++ b/tools/testing/selftests/bpf/bpf_kfuncs.h
@@ -92,4 +92,9 @@ extern int bpf_set_dentry_xattr(struct dentry *dentry, const char *name__str,
 				const struct bpf_dynptr *value_p, int flags) __ksym __weak;
 extern int bpf_remove_dentry_xattr(struct dentry *dentry, const char *name__str) __ksym __weak;
 
+struct xattr_ctx;
+extern int bpf_init_inode_xattr(struct xattr_ctx *xattr_ctx,
+				const char *name__str,
+				const struct bpf_dynptr *value_p) __ksym __weak;
+
 #endif
diff --git a/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c b/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c
index 43a26ec69a8e..0898898fb125 100644
--- a/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c
+++ b/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c
@@ -9,9 +9,10 @@
 #include <test_progs.h>
 #include "test_get_xattr.skel.h"
 #include "test_set_remove_xattr.skel.h"
+#include "test_init_inode_xattr.skel.h"
 #include "test_fsverity.skel.h"
 
-static const char testfile[] = "/tmp/test_progs_fs_kfuncs";
+static const char testfile[] = "/tmp/labelme";
 
 static void test_get_xattr(const char *name, const char *value, bool allow_access)
 {
@@ -268,6 +269,102 @@ static void test_fsverity(void)
 	remove(testfile);
 }
 
+static void test_init_inode_xattr(void)
+{
+	struct test_init_inode_xattr *skel = NULL;
+	int fd = -1, err;
+	char value_out[64];
+	const char *testfile_new = "/tmp/test_progs_fs_kfuncs_new";
+
+	skel = test_init_inode_xattr__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "test_init_inode_xattr__open_and_load"))
+		return;
+
+	skel->bss->monitored_pid = getpid();
+	err = test_init_inode_xattr__attach(skel);
+	if (!ASSERT_OK(err, "test_init_inode_xattr__attach"))
+		goto out;
+
+	/* Trigger inode_init_security */
+	fd = open(testfile_new, O_CREAT | O_RDWR, 0644);
+	if (!ASSERT_GE(fd, 0, "create_file"))
+		goto out;
+
+	ASSERT_EQ(skel->data->init_result, 0, "init_result");
+
+	/* initxattrs prepends "security." to the name. */
+	err = getxattr(testfile_new, "security.bpf.test_label", value_out,
+		       sizeof(value_out));
+	if (err < 0 && errno == ENODATA) {
+		printf("%s:SKIP:filesystem did not apply LSM xattrs\n",
+		       __func__);
+		test__skip();
+		goto out;
+	}
+	if (!ASSERT_GE(err, 0, "getxattr"))
+		goto out;
+
+	ASSERT_EQ(err, (int)sizeof(skel->data->xattr_value), "xattr_size");
+	ASSERT_EQ(strncmp(value_out, "unconfined_u:object_r:user_home_t:s0",
+			  sizeof("unconfined_u:object_r:user_home_t:s0")), 0,
+		  "xattr_value");
+
+out:
+	close(fd);
+	test_init_inode_xattr__destroy(skel);
+	remove(testfile_new);
+}
+
+/* Keep in sync with BPF_LSM_INODE_INIT_XATTRS in include/linux/bpf_lsm.h. */
+#define INIT_INODE_XATTR_MAX 4
+
+/* At most INIT_INODE_XATTR_MAX programs can attach to inode_init_security. */
+static void test_init_inode_xattr_attach_cap(void)
+{
+	struct test_init_inode_xattr *skel[INIT_INODE_XATTR_MAX + 1] = {};
+	struct bpf_link *link[INIT_INODE_XATTR_MAX + 1] = {};
+	struct bpf_link *extra = NULL;
+	int i, err;
+
+	/* Fill all available xattr slots */
+	for (i = 0; i < INIT_INODE_XATTR_MAX; i++) {
+		skel[i] = test_init_inode_xattr__open_and_load();
+		if (!ASSERT_OK_PTR(skel[i], "open_and_load"))
+			goto out;
+
+		link[i] = bpf_program__attach_lsm(skel[i]->progs.test_init_inode_xattr);
+		if (!ASSERT_OK_PTR(link[i], "attach_within_cap"))
+			goto out;
+	}
+
+	skel[INIT_INODE_XATTR_MAX] = test_init_inode_xattr__open_and_load();
+	if (!ASSERT_OK_PTR(skel[INIT_INODE_XATTR_MAX], "open_and_load_extra"))
+		goto out;
+
+	/* New additions fail with -E2BIG */
+	extra = bpf_program__attach_lsm(skel[INIT_INODE_XATTR_MAX]->progs.test_init_inode_xattr);
+	err = -errno;
+	if (!ASSERT_ERR_PTR(extra, "attach_over_cap_should_fail")) {
+		bpf_link__destroy(extra);
+		goto out;
+	}
+	ASSERT_EQ(err, -E2BIG, "attach_over_cap_errno");
+
+	bpf_link__destroy(link[0]);
+	link[0] = NULL; /* avoid double free in cleanup */
+
+	/* Freeing a slot lets the extra program attach */
+	extra = bpf_program__attach_lsm(skel[INIT_INODE_XATTR_MAX]->progs.test_init_inode_xattr);
+	ASSERT_OK_PTR(extra, "attach_after_detach");
+
+out:
+	bpf_link__destroy(extra);
+	for (i = 0; i <= INIT_INODE_XATTR_MAX; i++) {
+		bpf_link__destroy(link[i]);
+		test_init_inode_xattr__destroy(skel[i]);
+	}
+}
+
 void test_fs_kfuncs(void)
 {
 	/* Matches xattr_names in progs/test_get_xattr.c */
@@ -286,6 +383,12 @@ void test_fs_kfuncs(void)
 	if (test__start_subtest("set_remove_xattr"))
 		test_set_remove_xattr();
 
+	if (test__start_subtest("init_inode_xattr"))
+		test_init_inode_xattr();
+
+	if (test__start_subtest("init_inode_xattr_attach_cap"))
+		test_init_inode_xattr_attach_cap();
+
 	if (test__start_subtest("fsverity"))
 		test_fsverity();
 }
diff --git a/tools/testing/selftests/bpf/progs/test_init_inode_xattr.c b/tools/testing/selftests/bpf/progs/test_init_inode_xattr.c
new file mode 100644
index 000000000000..6f0e8b02ff88
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_init_inode_xattr.c
@@ -0,0 +1,31 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Cisco Systems, Inc. */
+
+#include "vmlinux.h"
+#include <bpf/bpf_tracing.h>
+#include "bpf_kfuncs.h"
+
+char _license[] SEC("license") = "GPL";
+
+__u32 monitored_pid;
+int init_result = -1;
+
+static const char xattr_name[] = "bpf.test_label";
+char xattr_value[] = "unconfined_u:object_r:user_home_t:s0";
+
+SEC("lsm.s/inode_init_security")
+int BPF_PROG(test_init_inode_xattr, struct inode *inode, struct inode *dir,
+	     const struct qstr *qstr, struct xattr_ctx *xattr_ctx)
+{
+	struct bpf_dynptr value_ptr;
+	__u32 pid;
+
+	pid = bpf_get_current_pid_tgid() >> 32;
+	if (pid != monitored_pid)
+		return 0;
+
+	bpf_dynptr_from_mem(xattr_value, sizeof(xattr_value), 0, &value_ptr);
+	init_result = bpf_init_inode_xattr(xattr_ctx, xattr_name, &value_ptr);
+
+	return 0;
+}
-- 
2.53.0



* [PATCH bpf-next v3 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: David Windsor @ 2026-06-18 20:34 UTC
  To: viro, brauner, jack, ast, daniel, john.fastabend, andrii, eddyz87,
	memxor, martin.lau, song, yonghong.song, jolsa, emil, kpsingh,
	mattbobrowski, paul, jmorris, serge, zohar, roberto.sassu,
	dmitry.kasatkin, eric.snowberg, stephen.smalley.work, omosnace,
	casey, shuah
  Cc: linux-kernel, linux-fsdevel, bpf, linux-security-module,
	linux-integrity, selinux, linux-kselftest, David Windsor
In-Reply-To: <20260618203411.73917-1-dwindsor@gmail.com>

Add bpf_init_inode_xattr() kfunc for BPF LSM programs to atomically set
xattrs via the inode_init_security hook using lsm_get_xattr_slot().

The inode_init_security hook previously took the xattr array and count
as two separate output parameters (struct xattr *xattrs, int
*xattr_count), which BPF programs cannot write to. Pass the xattr state
as a single context object (struct xattr_ctx) instead, and have
bpf_init_inode_xattr() take that context directly. Update the existing
in-tree callers of inode_init_security to take and forward the new
xattr_ctx.

A previous attempt [1] required a kmalloc string output protocol for
the xattr name. Since commit 6bcdfd2cac55 ("security: Allow all LSMs to
provide xattrs for inode_init_security hook") [2], the xattr name is no
longer allocated; it is a static constant.

Because we rely on the hook-specific ctx layout, the kfunc is
restricted to lsm/inode_init_security. Restrict the xattr names that
may be set via this kfunc to the bpf.* namespace.

Link: https://kernsec.org/pipermail/linux-security-module-archive/2022-October/034878.html [1]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6bcdfd2cac55 [2]
Suggested-by: Song Liu <song@kernel.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
---
 fs/bpf_fs_kfuncs.c                | 106 +++++++++++++++++++++++++++++-
 include/linux/bpf.h               |   1 +
 include/linux/bpf_lsm.h           |   3 +
 include/linux/evm.h               |   9 +--
 include/linux/lsm_hook_defs.h     |   4 +-
 include/linux/lsm_hooks.h         |  16 ++---
 include/linux/security.h          |   5 ++
 kernel/bpf/bpf_lsm.c              |  10 +++
 kernel/bpf/trampoline.c           |   3 +
 security/bpf/hooks.c              |   1 +
 security/integrity/evm/evm_main.c |   8 ++-
 security/security.c               |   7 +-
 security/selinux/hooks.c          |   4 +-
 security/smack/smack_lsm.c        |  27 ++++----
 14 files changed, 166 insertions(+), 38 deletions(-)

diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c
index 768aca2dc0f0..7abc3f3d1a67 100644
--- a/fs/bpf_fs_kfuncs.c
+++ b/fs/bpf_fs_kfuncs.c
@@ -10,6 +10,7 @@
 #include <linux/fsnotify.h>
 #include <linux/file.h>
 #include <linux/kernfs.h>
+#include <linux/lsm_hooks.h>
 #include <linux/mm.h>
 #include <linux/xattr.h>
 
@@ -374,6 +375,97 @@ __bpf_kfunc struct inode *bpf_real_inode(struct dentry *dentry)
 	return d_real_inode(dentry);
 }
 
+static int bpf_xattrs_used(const struct xattr_ctx *ctx)
+{
+	const size_t prefix_len = sizeof(XATTR_BPF_LSM_SUFFIX) - 1;
+	int i, n = 0;
+
+	for (i = 0; i < *ctx->xattr_count; i++) {
+		const char *name = ctx->xattrs[i].name;
+
+		if (name && !strncmp(name, XATTR_BPF_LSM_SUFFIX, prefix_len))
+			n++;
+	}
+	return n;
+}
+
+static int __bpf_init_inode_xattr(struct xattr_ctx *xattr_ctx,
+				  const char *name__str,
+				  const struct bpf_dynptr *value_p)
+{
+	struct bpf_dynptr_kern *value_ptr = (struct bpf_dynptr_kern *)value_p;
+	size_t name_len;
+	void *xattr_value;
+	struct xattr *xattr;
+	struct xattr *xattrs;
+	int *xattr_count;
+	const void *value;
+	u32 value_len;
+
+	if (!xattr_ctx || !name__str)
+		return -EINVAL;
+
+	xattrs = xattr_ctx->xattrs;
+	xattr_count = xattr_ctx->xattr_count;
+	if (!xattrs || !xattr_count)
+		return -EINVAL;
+	if (bpf_xattrs_used(xattr_ctx) >= BPF_LSM_INODE_INIT_XATTRS)
+		return -ENOSPC;
+
+	name_len = strlen(name__str);
+	if (name_len == 0 || name_len > XATTR_NAME_MAX)
+		return -EINVAL;
+	if (strncmp(name__str, XATTR_BPF_LSM_SUFFIX,
+		    sizeof(XATTR_BPF_LSM_SUFFIX) - 1))
+		return -EPERM;
+
+	value_len = __bpf_dynptr_size(value_ptr);
+	if (value_len == 0 || value_len > XATTR_SIZE_MAX)
+		return -EINVAL;
+
+	value = __bpf_dynptr_data(value_ptr, value_len);
+	if (!value)
+		return -EINVAL;
+
+	/* Combine xattr value + name into one allocation. */
+	xattr_value = kmalloc(value_len + name_len + 1, GFP_KERNEL);
+	if (!xattr_value)
+		return -ENOMEM;
+
+	memcpy(xattr_value, value, value_len);
+	memcpy(xattr_value + value_len, name__str, name_len);
+	((char *)xattr_value)[value_len + name_len] = '\0';
+
+	xattr = lsm_get_xattr_slot(xattr_ctx);
+	if (!xattr) {
+		kfree(xattr_value);
+		return -ENOSPC;
+	}
+
+	xattr->value = xattr_value;
+	xattr->name = (const char *)xattr_value + value_len;
+	xattr->value_len = value_len;
+
+	return 0;
+}
+
+/**
+ * bpf_init_inode_xattr - set an xattr on a new inode from inode_init_security
+ * @xattr_ctx: inode_init_security xattr state from the hook context
+ * @name__str: xattr name (e.g., "bpf.file_label")
+ * @value_p: dynptr containing the xattr value
+ *
+ * Only callable from lsm/inode_init_security programs.
+ *
+ * Return: 0 on success, negative error on failure.
+ */
+__bpf_kfunc int bpf_init_inode_xattr(struct xattr_ctx *xattr_ctx,
+				     const char *name__str,
+				     const struct bpf_dynptr *value_p)
+{
+	return __bpf_init_inode_xattr(xattr_ctx, name__str, value_p);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(bpf_fs_kfunc_set_ids)
@@ -385,13 +477,25 @@ BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_set_dentry_xattr, KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_remove_dentry_xattr, KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_real_inode, KF_SLEEPABLE | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_init_inode_xattr, KF_SLEEPABLE)
 BTF_KFUNCS_END(bpf_fs_kfunc_set_ids)
 
+BTF_ID_LIST(bpf_lsm_inode_init_security_btf_ids)
+BTF_ID(func, bpf_lsm_inode_init_security)
+
+BTF_ID_LIST(bpf_init_inode_xattr_btf_ids)
+BTF_ID(func, bpf_init_inode_xattr)
+
 static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
 {
 	if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) ||
-	    prog->type == BPF_PROG_TYPE_LSM)
+	    prog->type == BPF_PROG_TYPE_LSM) {
+		/* bpf_init_inode_xattr only attaches to inode_init_security. */
+		if (kfunc_id == bpf_init_inode_xattr_btf_ids[0] &&
+		    prog->aux->attach_btf_id != bpf_lsm_inode_init_security_btf_ids[0])
+			return -EACCES;
 		return 0;
+	}
 	return -EACCES;
 }
 
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 7719f6528445..f14bfcda78db 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1752,6 +1752,7 @@ struct bpf_prog_aux {
 	u32 real_func_cnt; /* includes hidden progs, only used for JIT and freeing progs */
 	u32 func_idx; /* 0 for non-func prog, the index in func array for func prog */
 	u32 attach_btf_id; /* in-kernel BTF type id to attach to */
+	u32 attach_limit; /* max concurrent attachments (0 = unlimited) */
 	u32 attach_st_ops_member_off;
 	u32 ctx_arg_info_size;
 	u32 max_rdonly_access;
diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
index 143775a27a2a..b655c708818e 100644
--- a/include/linux/bpf_lsm.h
+++ b/include/linux/bpf_lsm.h
@@ -19,6 +19,9 @@
 #include <linux/lsm_hook_defs.h>
 #undef LSM_HOOK
 
+/* max bpf xattrs per inode */
+#define BPF_LSM_INODE_INIT_XATTRS 4
+
 struct bpf_storage_blob {
 	struct bpf_local_storage __rcu *storage;
 };
diff --git a/include/linux/evm.h b/include/linux/evm.h
index 913f4573b203..0aa151288b36 100644
--- a/include/linux/evm.h
+++ b/include/linux/evm.h
@@ -12,6 +12,8 @@
 #include <linux/integrity.h>
 #include <linux/xattr.h>
 
+struct xattr_ctx;
+
 #ifdef CONFIG_EVM
 extern int evm_set_key(void *key, size_t keylen);
 extern enum integrity_status evm_verifyxattr(struct dentry *dentry,
@@ -21,8 +23,8 @@ extern enum integrity_status evm_verifyxattr(struct dentry *dentry,
 int evm_fix_hmac(struct dentry *dentry, const char *xattr_name,
 		 const char *xattr_value, size_t xattr_value_len);
 int evm_inode_init_security(struct inode *inode, struct inode *dir,
-			    const struct qstr *qstr, struct xattr *xattrs,
-			    int *xattr_count);
+			    const struct qstr *qstr,
+			    struct xattr_ctx *xattr_ctx);
 extern bool evm_revalidate_status(const char *xattr_name);
 extern int evm_protected_xattr_if_enabled(const char *req_xattr_name);
 extern int evm_read_protected_xattrs(struct dentry *dentry, u8 *buffer,
@@ -63,8 +65,7 @@ static inline int evm_fix_hmac(struct dentry *dentry, const char *xattr_name,
 
 static inline int evm_inode_init_security(struct inode *inode, struct inode *dir,
 					  const struct qstr *qstr,
-					  struct xattr *xattrs,
-					  int *xattr_count)
+					  struct xattr_ctx *xattr_ctx)
 {
 	return 0;
 }
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 65c9609ec207..f62780fbeb9e 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -116,8 +116,8 @@ LSM_HOOK(int, 0, inode_alloc_security, struct inode *inode)
 LSM_HOOK(void, LSM_RET_VOID, inode_free_security, struct inode *inode)
 LSM_HOOK(void, LSM_RET_VOID, inode_free_security_rcu, void *inode_security)
 LSM_HOOK(int, -EOPNOTSUPP, inode_init_security, struct inode *inode,
-	 struct inode *dir, const struct qstr *qstr, struct xattr *xattrs,
-	 int *xattr_count)
+	 struct inode *dir, const struct qstr *qstr,
+	 struct xattr_ctx *xattr_ctx)
 LSM_HOOK(int, 0, inode_init_security_anon, struct inode *inode,
 	 const struct qstr *name, const struct inode *context_inode)
 LSM_HOOK(int, 0, inode_create, struct inode *dir, struct dentry *dentry,
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index b4f8cad53ddb..710e48caaeba 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -200,20 +200,18 @@ extern struct lsm_static_calls_table static_calls_table __ro_after_init;
 
 /**
  * lsm_get_xattr_slot - Return the next available slot and increment the index
- * @xattrs: array storing LSM-provided xattrs
- * @xattr_count: number of already stored xattrs (updated)
+ * @ctx: xattr state shared by inode_init_security hooks
  *
- * Retrieve the first available slot in the @xattrs array to fill with an xattr,
- * and increment @xattr_count.
+ * Retrieve the first available slot in the @ctx->xattrs array to fill with an
+ * xattr, and increment @ctx->xattr_count.
  *
- * Return: The slot to fill in @xattrs if non-NULL, NULL otherwise.
+ * Return: The slot to fill in @ctx->xattrs if non-NULL, NULL otherwise.
  */
-static inline struct xattr *lsm_get_xattr_slot(struct xattr *xattrs,
-					       int *xattr_count)
+static inline struct xattr *lsm_get_xattr_slot(struct xattr_ctx *ctx)
 {
-	if (unlikely(!xattrs))
+	if (unlikely(!ctx || !ctx->xattrs || !ctx->xattr_count))
 		return NULL;
-	return &xattrs[(*xattr_count)++];
+	return &ctx->xattrs[(*ctx->xattr_count)++];
 }
 
 #endif /* ! __LINUX_LSM_HOOKS_H */
diff --git a/include/linux/security.h b/include/linux/security.h
index 153e9043058f..1f8e84e7dd7e 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -68,6 +68,11 @@ struct watch;
 struct watch_notification;
 struct lsm_ctx;
 
+struct xattr_ctx {
+	struct xattr *xattrs;
+	int *xattr_count;
+};
+
 /* Default (no) options for the capable function */
 #define CAP_OPT_NONE 0x0
 /* If capable should audit the security request */
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index 564071a92d7d..86a8e188b900 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -113,6 +113,9 @@ void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
 }
 #endif
 
+BTF_ID_LIST_SINGLE(bpf_lsm_inode_init_security_btf_ids, func,
+		   bpf_lsm_inode_init_security)
+
 int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
 			const struct bpf_prog *prog)
 {
@@ -137,6 +140,12 @@ int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
 		return -EINVAL;
 	}
 
+	/* bpf reserves a fixed number of xattr slots for itself.
+	 * Set the attach limit so the trampoline rejects excess attaches.
+	 */
+	if (btf_id == bpf_lsm_inode_init_security_btf_ids[0])
+		prog->aux->attach_limit = BPF_LSM_INODE_INIT_XATTRS;
+
 	return 0;
 }
 
@@ -315,6 +324,7 @@ BTF_ID(func, bpf_lsm_inode_create)
 BTF_ID(func, bpf_lsm_inode_free_security)
 BTF_ID(func, bpf_lsm_inode_getattr)
 BTF_ID(func, bpf_lsm_inode_getxattr)
+BTF_ID(func, bpf_lsm_inode_init_security)
 BTF_ID(func, bpf_lsm_inode_mknod)
 BTF_ID(func, bpf_lsm_inode_need_killpriv)
 BTF_ID(func, bpf_lsm_inode_post_setxattr)
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 1a721fc4bef5..b41b02173e24 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -859,6 +859,9 @@ static int bpf_trampoline_add_prog(struct bpf_trampoline *tr,
 	}
 	if (cnt >= BPF_MAX_TRAMP_LINKS)
 		return -E2BIG;
+	if (node->link->prog->aux->attach_limit &&
+	    tr->progs_cnt[kind] >= node->link->prog->aux->attach_limit)
+		return -E2BIG;
 	if (!hlist_unhashed(&node->tramp_hlist))
 		/* prog already linked */
 		return -EBUSY;
diff --git a/security/bpf/hooks.c b/security/bpf/hooks.c
index 40efde233f3a..d7c44c5c0e30 100644
--- a/security/bpf/hooks.c
+++ b/security/bpf/hooks.c
@@ -30,6 +30,7 @@ static int __init bpf_lsm_init(void)
 
 struct lsm_blob_sizes bpf_lsm_blob_sizes __ro_after_init = {
 	.lbs_inode = sizeof(struct bpf_storage_blob),
+	.lbs_xattr_count = BPF_LSM_INODE_INIT_XATTRS,
 };
 
 DEFINE_LSM(bpf) = {
diff --git a/security/integrity/evm/evm_main.c b/security/integrity/evm/evm_main.c
index b59e3f121b8a..e0a05162accc 100644
--- a/security/integrity/evm/evm_main.c
+++ b/security/integrity/evm/evm_main.c
@@ -1062,14 +1062,16 @@ static int evm_inode_copy_up_xattr(struct dentry *src, const char *name)
  * evm_inode_init_security - initializes security.evm HMAC value
  */
 int evm_inode_init_security(struct inode *inode, struct inode *dir,
-			    const struct qstr *qstr, struct xattr *xattrs,
-			    int *xattr_count)
+			    const struct qstr *qstr,
+			    struct xattr_ctx *xattr_ctx)
 {
 	struct evm_xattr *xattr_data;
 	struct xattr *xattr, *evm_xattr;
+	struct xattr *xattrs;
 	bool evm_protected_xattrs = false;
 	int rc;
 
+	xattrs = xattr_ctx ? xattr_ctx->xattrs : NULL;
 	if (!(evm_initialized & EVM_INIT_HMAC) || !xattrs)
 		return 0;
 
@@ -1087,7 +1089,7 @@ int evm_inode_init_security(struct inode *inode, struct inode *dir,
 	if (!evm_protected_xattrs)
 		return 0;
 
-	evm_xattr = lsm_get_xattr_slot(xattrs, xattr_count);
+	evm_xattr = lsm_get_xattr_slot(xattr_ctx);
 	/*
 	 * Array terminator (xattr name = NULL) must be the first non-filled
 	 * xattr slot.
diff --git a/security/security.c b/security/security.c
index 71aea8fdf014..8f82a1352356 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1334,6 +1334,7 @@ int security_inode_init_security(struct inode *inode, struct inode *dir,
 {
 	struct lsm_static_call *scall;
 	struct xattr *new_xattrs = NULL;
+	struct xattr_ctx xattr_ctx;
 	int ret = -EOPNOTSUPP, xattr_count = 0;
 
 	if (unlikely(IS_PRIVATE(inode)))
@@ -1349,10 +1350,12 @@ int security_inode_init_security(struct inode *inode, struct inode *dir,
 		if (!new_xattrs)
 			return -ENOMEM;
 	}
+	xattr_ctx.xattrs = new_xattrs;
+	xattr_ctx.xattr_count = &xattr_count;
 
 	lsm_for_each_hook(scall, inode_init_security) {
-		ret = scall->hl->hook.inode_init_security(inode, dir, qstr, new_xattrs,
-						  &xattr_count);
+		ret = scall->hl->hook.inode_init_security(inode, dir, qstr,
+							  &xattr_ctx);
 		if (ret && ret != -EOPNOTSUPP)
 			goto out;
 		/*
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 1a713d96206f..faa8a6b9c45b 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -2962,7 +2962,7 @@ static int selinux_dentry_create_files_as(struct dentry *dentry, int mode,
 
 static int selinux_inode_init_security(struct inode *inode, struct inode *dir,
 				       const struct qstr *qstr,
-				       struct xattr *xattrs, int *xattr_count)
+				       struct xattr_ctx *xattr_ctx)
 {
 	const struct cred_security_struct *crsec = selinux_cred(current_cred());
 	struct superblock_security_struct *sbsec;
@@ -2992,7 +2992,7 @@ static int selinux_inode_init_security(struct inode *inode, struct inode *dir,
 	    !(sbsec->flags & SBLABEL_MNT))
 		return -EOPNOTSUPP;
 
-	xattr = lsm_get_xattr_slot(xattrs, xattr_count);
+	xattr = lsm_get_xattr_slot(xattr_ctx);
 	if (xattr) {
 		rc = security_sid_to_context_force(newsid,
 						   &context, &clen);
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index ff115068c5c0..8ed5648a0116 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -981,10 +981,10 @@ smk_rule_transmutes(struct smack_known *subject,
 }
 
 static int
-xattr_dupval(struct xattr *xattrs, int *xattr_count,
+xattr_dupval(struct xattr_ctx *xattr_ctx,
 	     const char *name, const void *value, unsigned int vallen)
 {
-	struct xattr * const xattr = lsm_get_xattr_slot(xattrs, xattr_count);
+	struct xattr * const xattr = lsm_get_xattr_slot(xattr_ctx);
 
 	if (!xattr)
 		return 0;
@@ -1003,14 +1003,13 @@ xattr_dupval(struct xattr *xattrs, int *xattr_count,
  * @inode: the newly created inode
  * @dir: containing directory object
  * @qstr: unused
- * @xattrs: where to put the attributes
- * @xattr_count: current number of LSM-provided xattrs (updated)
+ * @xattr_ctx: where to put attributes and update count
  *
  * Returns 0 if it all works out, -ENOMEM if there's no memory
  */
 static int smack_inode_init_security(struct inode *inode, struct inode *dir,
 				     const struct qstr *qstr,
-				     struct xattr *xattrs, int *xattr_count)
+				     struct xattr_ctx *xattr_ctx)
 {
 	struct task_smack *tsp = smack_cred(current_cred());
 	struct inode_smack * const issp = smack_inode(inode);
@@ -1057,21 +1056,19 @@ static int smack_inode_init_security(struct inode *inode, struct inode *dir,
 		if (S_ISDIR(inode->i_mode)) {
 			transflag = SMK_INODE_TRANSMUTE;
 
-			if (xattr_dupval(xattrs, xattr_count,
-				XATTR_SMACK_TRANSMUTE,
-				TRANS_TRUE,
-				TRANS_TRUE_SIZE
-			))
+			if (xattr_dupval(xattr_ctx,
+					 XATTR_SMACK_TRANSMUTE,
+					 TRANS_TRUE,
+					 TRANS_TRUE_SIZE))
 				rc = -ENOMEM;
 		}
 	}
 
 	if (rc == 0)
-		if (xattr_dupval(xattrs, xattr_count,
-			    XATTR_SMACK_SUFFIX,
-			    issp->smk_inode->smk_known,
-		     strlen(issp->smk_inode->smk_known)
-		))
+		if (xattr_dupval(xattr_ctx,
+				 XATTR_SMACK_SUFFIX,
+				 issp->smk_inode->smk_known,
+				 strlen(issp->smk_inode->smk_known)))
 			rc = -ENOMEM;
 instant_inode:
 	issp->smk_flags |= (SMK_INODE_INSTANT | transflag);
-- 
2.53.0


^ permalink raw reply related

* [PATCH bpf-next v3 0/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: David Windsor @ 2026-06-18 20:34 UTC (permalink / raw)
  To: viro, brauner, jack, ast, daniel, john.fastabend, andrii, eddyz87,
	memxor, martin.lau, song, yonghong.song, jolsa, emil, kpsingh,
	mattbobrowski, paul, jmorris, serge, zohar, roberto.sassu,
	dmitry.kasatkin, eric.snowberg, stephen.smalley.work, omosnace,
	casey, shuah
  Cc: linux-kernel, linux-fsdevel, bpf, linux-security-module,
	linux-integrity, selinux, linux-kselftest, David Windsor

Many in-kernel LSMs (SELinux, Smack, IMA) store security labels in
extended attributes. For these LSMs, atomic labeling during inode
creation is critical: if the inode becomes accessible before its xattr
is set, it is briefly unlabeled, which can disrupt LSMs making policy
decisions based on file labels.

Existing LSMs solve this by setting xattrs directly in the
inode_init_security hook, which runs before the inode becomes
accessible. BPF LSM programs currently lack this capability because
the hook uses an output parameter (xattr_count) that BPF programs
cannot write to, and existing kfuncs like bpf_set_dentry_xattr
require a dentry that isn't available until after the inode is
accessible.

This series introduces the bpf_init_inode_xattr() kfunc, which takes
the combined inode_init_security xattr context argument to access
xattrs and xattr_count, and internally writes to xattr_count via
lsm_get_xattr_slot().
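
As a rough illustration of the combined-context idea, here is a userspace
sketch (not the kernel code). struct xattr_ctx and the slot helper mirror
the definitions in this series; the simplified struct xattr and the
fill_label() hook body are hypothetical stand-ins for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct xattr (assumption). */
struct xattr {
	const char *name;
	const void *value;
	size_t value_len;
};

/* Same shape as the struct xattr_ctx added to include/linux/security.h. */
struct xattr_ctx {
	struct xattr *xattrs;
	int *xattr_count;
};

/* Mirrors lsm_get_xattr_slot(): hand out the next slot and bump the count. */
static struct xattr *get_xattr_slot(struct xattr_ctx *ctx)
{
	if (!ctx || !ctx->xattrs || !ctx->xattr_count)
		return NULL;
	return &ctx->xattrs[(*ctx->xattr_count)++];
}

/* Hypothetical hook body: claim one slot and store a label in it. */
static int fill_label(struct xattr_ctx *ctx, const char *name,
		      const char *value)
{
	struct xattr *slot = get_xattr_slot(ctx);

	if (!slot)
		return 0;	/* no array supplied: nothing to do */
	slot->name = name;
	slot->value = value;
	slot->value_len = strlen(value);
	return 0;
}
```

Because the count is reached through the context pointer, the caller of
the hook never has to write through a bare int * argument. Like the
kernel helper, the sketch does no bounds check, so the array has to be
sized for every slot that can be claimed.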

v3:
  - rename struct lsm_xattr_ctx to struct xattr_ctx (Paul)
  - increase BPF_LSM_INODE_INIT_XATTRS to 4 (Song)
  - enforce per-hook attachment cap at attach time to prevent
    runtime rejection (Paul)
  - add init_inode_xattr_attach_cap selftest

v2:
  - pass the xattr state as a combined context object and drop the
    verifier fixup path (Kumar)
  - restrict bpf_init_inode_xattr labels to bpf.* namespace (Matt)
  - cap bpf_init_inode_xattr() at BPF_LSM_INODE_INIT_XATTRS slots per
    invocation (AI)
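
The attach-time cap from the v3 notes can be modeled in userspace
roughly as follows. This is only a sketch of the check added to
bpf_trampoline_add_prog(): struct tramp and tramp_add_prog() are
hypothetical names, and the value used for MAX_TRAMP_LINKS is an
assumption standing in for BPF_MAX_TRAMP_LINKS.

```c
#include <assert.h>
#include <errno.h>

#define MAX_TRAMP_LINKS 38	/* stand-in for BPF_MAX_TRAMP_LINKS (assumed value) */
#define INODE_INIT_XATTRS 4	/* BPF_LSM_INODE_INIT_XATTRS in this series */

/* Hypothetical, reduced view of a trampoline: only the attach count. */
struct tramp {
	int progs_cnt;		/* programs already attached to the hook */
};

/* Reject an attach once the generic or the per-hook limit is reached. */
static int tramp_add_prog(struct tramp *tr, int attach_limit)
{
	if (tr->progs_cnt >= MAX_TRAMP_LINKS)
		return -E2BIG;
	/* A limit of 0 means "no per-hook cap". */
	if (attach_limit && tr->progs_cnt >= attach_limit)
		return -E2BIG;
	tr->progs_cnt++;
	return 0;
}
```

With the limit set to the number of reserved xattr slots, the fifth
attach fails up front instead of a program running out of slots at
inode creation time.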

Link: https://lore.kernel.org/all/20260503211835.16103-1-dwindsor@gmail.com/ [v2]

David Windsor (2):
  bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
  selftests/bpf: add tests for bpf_init_inode_xattr kfunc

 fs/bpf_fs_kfuncs.c                            | 106 +++++++++++++++++-
 include/linux/bpf.h                           |   1 +
 include/linux/bpf_lsm.h                       |   3 +
 include/linux/evm.h                           |   9 +-
 include/linux/lsm_hook_defs.h                 |   4 +-
 include/linux/lsm_hooks.h                     |  16 ++-
 include/linux/security.h                      |   5 +
 kernel/bpf/bpf_lsm.c                          |  10 ++
 kernel/bpf/trampoline.c                       |   3 +
 security/bpf/hooks.c                          |   1 +
 security/integrity/evm/evm_main.c             |   8 +-
 security/security.c                           |   7 +-
 security/selinux/hooks.c                      |   4 +-
 security/smack/smack_lsm.c                    |  27 ++---
 tools/testing/selftests/bpf/bpf_kfuncs.h      |   5 +
 .../selftests/bpf/prog_tests/fs_kfuncs.c      | 105 ++++++++++++++++-
 .../bpf/progs/test_init_inode_xattr.c         |  31 +++++
 17 files changed, 306 insertions(+), 39 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_init_inode_xattr.c


base-commit: e771677c937da5808f7b6c1f0e4a97ec1a84f8a8
-- 
2.53.0


^ permalink raw reply

* Re: Landlock: LANDLOCK_ACCESS_FS_IOCTL_DEV bypass via io_uring IORING_OP_URING_CMD
From: Bryam Vargas @ 2026-06-18 20:11 UTC (permalink / raw)
  To: Günther Noack; +Cc: Mickaël Salaün, linux-security-module
In-Reply-To: <ajPhlNSgUcmBoFcM@google.com>

Günther,

Thanks, and #65 looks right.

On the approach: it's a Landlock-only change either way, both hooks already
exist, so no io_uring core churn.

Coarse (block ring creation) can hang off security_uring_allowed(), the existing
io_uring_setup() gate. That matches the creation-control direction Mickaël raised
-- the socket-creation work he said would suit io_uring too -- and it's a fine
default, since most sandboxes don't need io_uring. One caveat: it overlaps
kernel.io_uring_disabled and a seccomp filter on io_uring_setup, so the
Landlock-specific win is mainly composing it in a ruleset.

Fine-grained (gate device uring_cmd) is the only one that closes the asymmetry I
reported. It uses security_uring_cmd() -- the hook SELinux and Smack already have
and we don't -- and needs no new right: gate device files on the existing
IOCTL_DEV, mirroring hook_file_ioctl_common(). All-or-nothing per device, since
cmd_op is a private number space.

So I'd go coarse-first as you suggest, and keep the uring_cmd gate as the granular
step; it's little code and reuses an existing right. Happy to prototype either
once you and Mickaël settle on the shape; I'll hold until then.

Bryam


^ permalink raw reply

* Re: [PATCH v5 7/8] vfs: Replace security_sb_mount/security_move_mount with granular hooks
From: Bryam Vargas @ 2026-06-18 19:33 UTC (permalink / raw)
  To: Song Liu
  Cc: Christian Brauner, Al Viro, Stephen Smalley, Ondrej Mosnacek,
	Mickaël Salaün, John Johansen, Paul Moore, James Morris,
	Serge Hallyn, linux-security-module, linux-fsdevel, linux-kernel

Song,

> +	err = security_mount_change_type(path, ms_flags);

This gates the propagation change on the mount(2) path. The same change on
the newer mount_setattr(2)/open_tree_attr(2) path is left open:
do_mount_setattr() -> mount_setattr_commit() calls change_mnt_propagation()
for the propagation and writes the MNT_NOEXEC/NOSUID/NODEV/READONLY flags --
the same work do_change_type() and do_reconfigure_mnt() do, but with no
hook. security_sb_mount() never reached that path either, so the gap isn't
new. But once this series checks the mount(2) propagation and remount
paths, mount_setattr(2) is the one path left without a check.

It's reachable. A Landlock domain denies mount(2) for the confined task, so
mount(MS_PRIVATE) and a remount clearing noexec both return -EPERM -- but
mount_setattr(propagation=MS_PRIVATE) and
mount_setattr(attr_clr=MOUNT_ATTR_NOEXEC) succeed, and the task then runs a
binary on a mount the policy marked noexec. A SELinux/AppArmor policy that
denies the mount has the same gap. With this series applied,
do_mount_setattr() still carries no security_ call, so the divergence
stands.

Adding the propagation hook and a reconfigure hook in
mount_setattr_commit() would cover mount_setattr too. Happy to send that as
a patch if you want it folded in.

Bryam


^ permalink raw reply

* Re: [PATCH v5 7/8] vfs: Replace security_sb_mount/security_move_mount with granular hooks
From: Christian Brauner @ 2026-06-18 14:02 UTC (permalink / raw)
  To: Song Liu
  Cc: Christian Brauner, linux-security-module, linux-fsdevel, selinux,
	apparmor, paul, jmorris, serge, viro, jack, john.johansen,
	stephen.smalley.work, omosnace, mic, gnoack, takedakn,
	penguin-kernel, herton, kernel-team
In-Reply-To: <CAPhsuW7Wn8GYrsrRhEFXQH5buaP+pdTKc0UV8Mn0B3OnNN-44g@mail.gmail.com>

On 2026-06-18 18:56:42+08:00, Song Liu wrote:
> On Wed, Jun 17, 2026 at 9:53 PM Christian Brauner <brauner@kernel.org> wrote:
> 
> > On Thu, May 28, 2026 at 11:26:06AM -0700, Song Liu wrote:
> 
> [...]
> 
> > >
> >
> > This again is racy as it is called outside of the namespace semaphore:
> >
> >         err = security_mount_bind(&old_path, path, recurse);
> >         if (err)
> >                 return err;
> >
> >         if (mnt_ns_loop(old_path.dentry))
> >                 return -EINVAL;
> >
> >         LOCK_MOUNT(mp, path);
> >         if (IS_ERR(mp.parent))
> >                 return PTR_ERR(mp.parent);
> >
> > After LOCK_MOUNT @path might point to a completely different mount than
> > the one you performed your security checks on.
> 
> I thought we agreed at LSF/MM/BPF 2026 to add the LSM hooks
> before taking namespace semaphore, so that it is possible for LSMs
> to defend against DoS attacks on namespace semaphore? Did I
> miss/misunderstand something?

I think there was a misunderstanding. What I pointed out was that it's a
trade-off. If we do call security hooks under the namespace semaphore or
mount lock, then anything that's called under there must take care not to
cause deadlocks - which is especially easy to do with the mount lock, and
even with the namespace semaphore it may get hairy (automounts etc.). The
DoS thing is another worry, but if an LSM does stupid things we tell it
not to do stupid things and to go away.

But as the hooks are done right now they are meaningless from a security
perspective. You might have a policy that allows mounting on dentry_a
and denies mounting on dentry_b: before LOCK_MOUNT*() you may see dentry_a
and allow the mount but after LOCK_MOUNT*() someone raced you and shoved
a dentry_b mount onto dentry_b and now you allow overmounting dentry_b
which your policy didn't allow -> hosed.
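
The window described above can be modeled in userspace like this. It is
a deterministic sketch, not kernel code: every name in it is
hypothetical, and the "racer" callback stands in for another task
changing what the path resolves to before the lock is taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for "what @path currently resolves to". */
struct mnt_target {
	const char *dentry;
};

/* Example policy: mounting on dentry_a is allowed, anything else is not. */
static bool policy_allows(const struct mnt_target *t)
{
	return strcmp(t->dentry, "dentry_a") == 0;
}

typedef void (*racer_fn)(struct mnt_target *);

/* Simulated racing task: the path now resolves to dentry_b. */
static void overmount_with_b(struct mnt_target *t)
{
	t->dentry = "dentry_b";
}

/* Check first, lock later: returns the dentry mounted on, NULL if denied. */
static const char *mount_check_before_lock(struct mnt_target *t, racer_fn racer)
{
	if (!policy_allows(t))
		return NULL;
	if (racer)
		racer(t);	/* race window between check and lock */
	/* lock taken here; the target is whatever the path resolves to now */
	return t->dentry;
}

/* Lock first, check under the lock: the racer can only run before it. */
static const char *mount_check_under_lock(struct mnt_target *t, racer_fn racer)
{
	if (racer)
		racer(t);
	/* lock held from here on, so the checked target is the used target */
	if (!policy_allows(t))
		return NULL;
	return t->dentry;
}
```

In the first variant the policy approves dentry_a but the operation lands
on dentry_b; in the second the same race ends in a denial.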

> > Placement of this hook suffers from the same issue as the bind mount
> > hook. Here it's worse because the security layer isn't even informed
> > about MOVE_MOUNT_BENEATH which completely alters the mount relationship.
> 
> Current hook security_move_mount doesn't handle
> MOVE_MOUNT_BENEATH. But we can add mflags to security_mount_move().
> Do we need anything other than mflags?

I think you either need to pass three mounts (source, target, top_mnt),
where target == top_mnt in the non-beneath case, or you need two separate
hooks, because for MOVE_MOUNT_BENEATH you may want to have a three-part
policy: source, target, top_mnt.
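
One possible shape for the three-mount variant, as a userspace sketch
only. Nothing here is an existing kernel interface: struct mnt, the
"trusted" flag, and mount_move_check() are hypothetical, and the policy
is just an example of a decision that needs all three arguments.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced mount object with one policy-relevant bit. */
struct mnt {
	int id;
	bool trusted;
};

/*
 * Hypothetical hook: for a plain move the caller passes target == top_mnt;
 * for a beneath-style move top_mnt is the mount that ends up on top of
 * source, so the policy can look at all three.
 */
static int mount_move_check(const struct mnt *source,
			    const struct mnt *target,
			    const struct mnt *top_mnt)
{
	bool beneath = (target != top_mnt);

	if (!source->trusted || !target->trusted)
		return -EPERM;
	/* Only the beneath case has a distinct third party to vet. */
	if (beneath && !top_mnt->trusted)
		return -EPERM;
	return 0;
}
```

A two-argument hook plus mflags could tell the policy that the move is
a beneath move, but not which mount ends up on top.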


^ permalink raw reply

* Re: Landlock: LANDLOCK_ACCESS_FS_IOCTL_DEV bypass via io_uring IORING_OP_URING_CMD
From: Günther Noack @ 2026-06-18 12:16 UTC (permalink / raw)
  To: Bryam Vargas; +Cc: Mickaël Salaün, linux-security-module
In-Reply-To: <20260617230237.14718-1-hexlabsecurity@proton.me>

Hello Bryam!

On Wed, Jun 17, 2026 at 11:02:41PM +0000, Bryam Vargas wrote:
> Thanks Günther, and thanks for filing #64.
> 
> Straight to your two questions:
> 
> 1. Block: you're right. blkdev_uring_cmd() has a single case, BLOCK_URING_CMD_DISCARD,
>    and the blkdev.h note that it's a separate number space is fair, so I'm not arguing
>    it should be a generic ioctl multiplexer. The "others, through other devices" are on
>    NVMe: the namespace char dev takes NVME_URING_CMD_IO / _IO_VEC, and AFAICT a
>    write-capable confined task can reach IO passthrough (write, DSM/discard) with no
>    capability, since nvme_cmd_allowed() only wants FMODE_WRITE there.
> 
>    Correction to my own report: I overstated the ceiling. The NVMe admin ops
>    (format, sanitize, firmware, security-send) sit behind capable(CAP_SYS_ADMIN)
>    in nvme_cmd_allowed(), so a Landlocked unprivileged task can't reach them. The
>    A:H / 8.4 figure was wrong; only namespace IO is in scope for a confined task.
> 
> 2. Truncate: correct, no sidestep, and none looks possible. I went through every
>    f_op->uring_cmd provider (block, NVMe, btrfs encoded I/O, FUSE, ublk, sockets, ...)
>    and none change file size; truncate(2)/ftruncate(2) keep their own hook. Please
>    ignore the "and truncate where relevant" line in my suggested direction, it was
>    speculative.
> 
> On framing: I'm happy to call this a coverage gap rather than a bypass. IOCTL_DEV was
> never documented to cover io_uring, so nothing it promised is broken. The one hard fact
> is the asymmetry: ioctl(2) BLKDISCARD is denied (IOCTL_DEV, and it's not in
> is_masked_device_ioctl()), the same op via uring_cmd isn't, and SELinux/Smack already
> hook security_uring_cmd while Landlock doesn't. Whether that's worth a hook or just the
> doc clarification Mickaël mentioned is your call.

Agreed, a coverage gap is in my mind the right way to think about it.

I filed this issue about that gap:
https://github.com/landlock-lsm/linux/issues/65

Even though that's technically a feature request, you are quite
right to point it out.

As I also say in that issue's description, there are in principle
multiple ways of blocking such a feature.  It is possible to block it at
the fine-grained layer in uring_cmd, but maybe a more practical way to
go about it would be to block the creation of an io_uring itself, since
most sandboxed processes do not normally make use of that feature.

(IMHO, we have already made a similar mistake in networking, where we
first built restrictions for individual TCP operations, but left all the
other protocols unrestricted.  Maybe the better approach is to start
with the coarser restriction that addresses the majority of use cases
and then provide more granular controls later.)

I'd be interested to hear people's opinions.

(Mickaël, if you feel that framing this as a feature request is the
wrong approach, please speak up as well.)


> If you do want one, I can send an RFC for an all-or-nothing "IOCTL_DEV for any uring_cmd
> on a device file" hook (cmd_op is a private number space, so porting
> is_masked_device_ioctl() wouldn't be right). Otherwise I'll drop the provider detail
> into #64 and leave it at the doc fix.

I'd be happy to review your patches for the issue.  But let's find a
consensus on the overall approach first -- that will hopefully also save
you from going too much in circles in the implementation.

Thanks!
—Günther

^ permalink raw reply

* Re: [PATCH v5 7/8] vfs: Replace security_sb_mount/security_move_mount with granular hooks
From: Song Liu @ 2026-06-18 10:56 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-security-module, linux-fsdevel, selinux, apparmor, paul,
	jmorris, serge, viro, jack, john.johansen, stephen.smalley.work,
	omosnace, mic, gnoack, takedakn, penguin-kernel, herton,
	kernel-team
In-Reply-To: <20260617-laufbahn-eifrig-charmant-a48f357a0c52@brauner>

On Wed, Jun 17, 2026 at 9:53 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Thu, May 28, 2026 at 11:26:06AM -0700, Song Liu wrote:
[...]
> >
> > +     err = security_mount_bind(&old_path, path, recurse);
> > +     if (err)
> > +             return err;
>
> This again is racy as it is called outside of the namespace semaphore:
>
>         err = security_mount_bind(&old_path, path, recurse);
>         if (err)
>                 return err;
>
>         if (mnt_ns_loop(old_path.dentry))
>                 return -EINVAL;
>
>         LOCK_MOUNT(mp, path);
>         if (IS_ERR(mp.parent))
>                 return PTR_ERR(mp.parent);
>
> After LOCK_MOUNT @path might point to a completely different mount than
> the one you performed your security checks on.

I thought we agreed at LSF/MM/BPF 2026 to add the LSM hooks
before taking namespace semaphore, so that it is possible for LSMs
to defend against DoS attacks on namespace semaphore? Did I
miss/misunderstand something?

> > +
> >       if (mnt_ns_loop(old_path.dentry))
> >               return -EINVAL;
> >
[...]
> >
> >       err = parse_monolithic_mount_data(fc, data);
> > +     if (!err)
> > +             err = security_mount_remount(fc, path, mnt_flags, flags,
> > +                                         data);
> >       if (!err) {
> >               down_write(&sb->s_umount);
> >               err = -EPERM;
> > @@ -3708,6 +3724,10 @@ static int do_move_mount_old(const struct path *path, const char *old_name)
> >       if (err)
> >               return err;
> >
> > +     err = security_mount_move(&old_path, path);
> > +     if (err)
> > +             return err;
>
> Placement of this hook suffers from the same issue as the bind mount
> hook. Here it's worse because the security layer isn't even informed
> about MOVE_MOUNT_BENEATH which completely alters the mount relationship.

Current hook security_move_mount doesn't handle
MOVE_MOUNT_BENEATH. But we can add mflags to security_mount_move().
Do we need anything other than mflags?

Thanks,
Song

^ permalink raw reply

