Linux Security Modules development

Linux Security Modules development
 help / color / mirror / Atom feed

* Re: [PATCH bpf-next 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: David Windsor @ 2026-04-27  3:23 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: Alexander Viro, Christian Brauner, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman, KP Singh,
	Matt Bobrowski, Paul Moore, James Morris, Serge E. Hallyn,
	Song Liu, Jan Kara, John Fastabend, Martin KaFai Lau,
	Yonghong Song, Jiri Olsa, linux-fsdevel, linux-kernel, bpf,
	linux-security-module
In-Reply-To: <CAP01T74iuSDJeL6g=5yLdYGb-VcESK1SYY3R1S1r-CtQEsW+oQ@mail.gmail.com>

On Sun, Apr 26, 2026 at 10:57 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Mon, 27 Apr 2026 at 02:16, David Windsor <dwindsor@gmail.com> wrote:
> >
> > Add bpf_init_inode_xattr() kfunc for BPF LSM programs to atomically set
> > xattrs via inode_init_security hook using lsm_get_xattr_slot().
> >
> > lsm_get_xattr_slot() claims a slot by writing to xattr_count, which BPF
> > programs cannot do: hook arguments are not directly writable from BPF.
> > To hide this, the BPF-facing API is just bpf_init_inode_xattr(name,
> > value), and the verifier transparently rewrites each call into
> > bpf_init_inode_xattr_impl(xattrs, xattr_count, name, value). xattrs and
> > xattr_count are extracted from the hook context, which the verifier
> > spills to the stack at program entry since R1 is clobbered during normal
> > execution.
> >
> > A previous attempt [1] required a kmalloc string output protocol for
> > the xattr name. Since commit 6bcdfd2cac55 ("security: Allow all LSMs to
> > provide xattrs for inode_init_security hook") [2], the xattr name is no
> > longer allocated; it is a static constant. We take advantage of this by
> > passing the name directly. Because we rely on the hook-specific ctx
> > layout, the kfunc is restricted to lsm/inode_init_security.
> >
> > Link: https://kernsec.org/pipermail/linux-security-module-archive/2022-October/034878.html [1]
> > Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6bcdfd2cac55 [2]
> > Suggested-by: Song Liu <song@kernel.org>
> > Signed-off-by: David Windsor <dwindsor@gmail.com>
> > ---
>
> The explanation and code make no sense to me. Why not pass xattrs and
> xattr_count directly as arguments, even if you choose to restrict the
> kfunc to a specific hook? Why does the verifier core need the hack to
> spill the context and extract the two arguments?
>

xattr_count is an output parameter; we cannot currently write to it in
bpf as there is no verifier support for writing to int *.  xattrs and
xattr_count can be fixed up by the verifier, so we only require the
user to pass the name and value.

> pw-bot: cr
>
> >  [...]

^ permalink raw reply

* Re: [PATCH bpf-next 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: Kumar Kartikeya Dwivedi @ 2026-04-27  2:56 UTC (permalink / raw)
  To: David Windsor
  Cc: Alexander Viro, Christian Brauner, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman, KP Singh,
	Matt Bobrowski, Paul Moore, James Morris, Serge E. Hallyn,
	Song Liu, Jan Kara, John Fastabend, Martin KaFai Lau,
	Yonghong Song, Jiri Olsa, linux-fsdevel, linux-kernel, bpf,
	linux-security-module
In-Reply-To: <20260427001602.38353-2-dwindsor@gmail.com>

On Mon, 27 Apr 2026 at 02:16, David Windsor <dwindsor@gmail.com> wrote:
>
> Add bpf_init_inode_xattr() kfunc for BPF LSM programs to atomically set
> xattrs via inode_init_security hook using lsm_get_xattr_slot().
>
> lsm_get_xattr_slot() claims a slot by writing to xattr_count, which BPF
> programs cannot do: hook arguments are not directly writable from BPF.
> To hide this, the BPF-facing API is just bpf_init_inode_xattr(name,
> value), and the verifier transparently rewrites each call into
> bpf_init_inode_xattr_impl(xattrs, xattr_count, name, value). xattrs and
> xattr_count are extracted from the hook context, which the verifier
> spills to the stack at program entry since R1 is clobbered during normal
> execution.
>
> A previous attempt [1] required a kmalloc string output protocol for
> the xattr name. Since commit 6bcdfd2cac55 ("security: Allow all LSMs to
> provide xattrs for inode_init_security hook") [2], the xattr name is no
> longer allocated; it is a static constant. We take advantage of this by
> passing the name directly. Because we rely on the hook-specific ctx
> layout, the kfunc is restricted to lsm/inode_init_security.
>
> Link: https://kernsec.org/pipermail/linux-security-module-archive/2022-October/034878.html [1]
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6bcdfd2cac55 [2]
> Suggested-by: Song Liu <song@kernel.org>
> Signed-off-by: David Windsor <dwindsor@gmail.com>
> ---

The explanation and code make no sense to me. Why not pass xattrs and
xattr_count directly as arguments, even if you choose to restrict the
kfunc to a specific hook? Why does the verifier core need the hack to
spill the context and extract the two arguments?

pw-bot: cr

>  [...]

^ permalink raw reply

* Re: [PATCH bpf-next 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: bot+bpf-ci @ 2026-04-27  0:51 UTC (permalink / raw)
  To: dwindsor, viro, brauner, ast, daniel, andrii, eddyz87, memxor,
	kpsingh, mattbobrowski, paul, jmorris, serge
  Cc: dwindsor, song, jack, john.fastabend, martin.lau, yonghong.song,
	jolsa, linux-fsdevel, linux-kernel, bpf, linux-security-module,
	ast, andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
	ihor.solodrai
In-Reply-To: <20260427001602.38353-2-dwindsor@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1536 bytes --]

> diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c
> --- a/fs/bpf_fs_kfuncs.c
> +++ b/fs/bpf_fs_kfuncs.c

[ ... ]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c

[ ... ]

> @@ -19272,6 +19299,33 @@ struct bpf_insn addr[2] = { BPF_LD_IMM64(BPF_REG_2, (long)kptr_struct_meta) };
>  		insn_buf[4] = BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1);
>  		insn_buf[5] = BPF_ALU64_IMM(BPF_NEG, BPF_REG_0, 0);
>  		*cnt = 6;
> +	} else if (env->insn_aux_data[insn_idx].init_inode_xattr_fixup) {
> +		struct bpf_kfunc_desc *impl_desc;
> +
> +		impl_desc = find_kfunc_desc(env->prog,
> +					    special_kfunc_list[KF_bpf_init_inode_xattr_impl], 0);
> +		if (!impl_desc) {
> +			verifier_bug(env, "bpf_init_inode_xattr_impl desc not found");
> +			return -EFAULT;
> +		}
> +
> +		/* Rewrite bpf_init_inode_xattr(name, value) to inject xattrs and
> +		 * xattr_count loaded from the saved inode_init_security ctx.
> +		 */

This isn't a bug, but the comment style needs adjustment. In
kernel/bpf/verifier.c, multi-line comments should have the opening /*
on its own line:

		/*
		 * Rewrite bpf_init_inode_xattr(name, value) to inject xattrs
		 * and xattr_count loaded from the saved inode_init_security
		 * ctx.
		 */

[ ... ]


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/24970971112

^ permalink raw reply

* [PATCH bpf-next 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: David Windsor @ 2026-04-27  0:15 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman,
	Kumar Kartikeya Dwivedi, KP Singh, Matt Bobrowski, Paul Moore,
	James Morris, Serge E. Hallyn
  Cc: David Windsor, Song Liu, Jan Kara, John Fastabend,
	Martin KaFai Lau, Yonghong Song, Jiri Olsa, linux-fsdevel,
	linux-kernel, bpf, linux-security-module
In-Reply-To: <20260427001602.38353-1-dwindsor@gmail.com>

Add bpf_init_inode_xattr() kfunc for BPF LSM programs to atomically set
xattrs via inode_init_security hook using lsm_get_xattr_slot().

lsm_get_xattr_slot() claims a slot by writing to xattr_count, which BPF
programs cannot do: hook arguments are not directly writable from BPF.
To hide this, the BPF-facing API is just bpf_init_inode_xattr(name,
value), and the verifier transparently rewrites each call into
bpf_init_inode_xattr_impl(xattrs, xattr_count, name, value). xattrs and
xattr_count are extracted from the hook context, which the verifier
spills to the stack at program entry since R1 is clobbered during normal
execution.

A previous attempt [1] required a kmalloc string output protocol for
the xattr name. Since commit 6bcdfd2cac55 ("security: Allow all LSMs to
provide xattrs for inode_init_security hook") [2], the xattr name is no
longer allocated; it is a static constant. We take advantage of this by
passing the name directly. Because we rely on the hook-specific ctx
layout, the kfunc is restricted to lsm/inode_init_security.

Link: https://kernsec.org/pipermail/linux-security-module-archive/2022-October/034878.html [1]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6bcdfd2cac55 [2]
Suggested-by: Song Liu <song@kernel.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
---
 fs/bpf_fs_kfuncs.c           | 80 +++++++++++++++++++++++++++++++++++-
 include/linux/bpf_verifier.h |  3 ++
 kernel/bpf/fixups.c          | 20 +++++++++
 kernel/bpf/verifier.c        | 54 ++++++++++++++++++++++++
 security/bpf/hooks.c         |  3 ++
 5 files changed, 159 insertions(+), 1 deletion(-)

diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c
index 9d27be058494..5a5951006a3f 100644
--- a/fs/bpf_fs_kfuncs.c
+++ b/fs/bpf_fs_kfuncs.c
@@ -10,6 +10,7 @@
 #include <linux/fsnotify.h>
 #include <linux/file.h>
 #include <linux/kernfs.h>
+#include <linux/lsm_hooks.h>
 #include <linux/mm.h>
 #include <linux/xattr.h>
 
@@ -353,6 +354,68 @@ __bpf_kfunc int bpf_cgroup_read_xattr(struct cgroup *cgroup, const char *name__s
 }
 #endif /* CONFIG_CGROUPS */
 
+/* Called from the verifier fixup of bpf_init_inode_xattr(). */
+__bpf_kfunc int bpf_init_inode_xattr_impl(struct xattr *xattrs, int *xattr_count,
+					  const char *name__str,
+					  const struct bpf_dynptr *value_p)
+{
+	struct bpf_dynptr_kern *value_ptr = (struct bpf_dynptr_kern *)value_p;
+	size_t name_len;
+	void *xattr_value;
+	struct xattr *xattr;
+	const void *value;
+	u32 value_len;
+
+	if (!xattrs || !xattr_count || !name__str)
+		return -EINVAL;
+
+	name_len = strlen(name__str);
+	if (name_len == 0 || name_len > XATTR_NAME_MAX)
+		return -EINVAL;
+
+	value_len = __bpf_dynptr_size(value_ptr);
+	if (value_len == 0 || value_len > XATTR_SIZE_MAX)
+		return -EINVAL;
+
+	value = __bpf_dynptr_data(value_ptr, value_len);
+	if (!value)
+		return -EINVAL;
+
+	xattr_value = kmemdup(value, value_len, GFP_ATOMIC);
+	if (!xattr_value)
+		return -ENOMEM;
+
+	xattr = lsm_get_xattr_slot(xattrs, xattr_count);
+	if (!xattr) {
+		kfree(xattr_value);
+		return -ENOSPC;
+	}
+
+	xattr->name = name__str;
+	xattr->value = xattr_value;
+	xattr->value_len = value_len;
+
+	return 0;
+}
+
+/**
+ * bpf_init_inode_xattr - set an xattr on a new inode from inode_init_security
+ * @name__str: xattr name (e.g., "bpf.file_label")
+ * @value_p: dynptr containing the xattr value
+ *
+ * Only callable from lsm/inode_init_security programs. The verifier rewrites
+ * calls to bpf_init_inode_xattr_impl() with xattrs/xattr_count extracted from
+ * the hook context.
+ *
+ * Return: 0 on success, negative error on failure.
+ */
+__bpf_kfunc int bpf_init_inode_xattr(const char *name__str,
+				     const struct bpf_dynptr *value_p)
+{
+	WARN_ONCE(1, "%s called without verifier fixup\n", __func__);
+	return -EFAULT;
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(bpf_fs_kfunc_set_ids)
@@ -363,13 +426,28 @@ BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_set_dentry_xattr, KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_remove_dentry_xattr, KF_SLEEPABLE)
+BTF_ID_FLAGS(func, bpf_init_inode_xattr)
+BTF_ID_FLAGS(func, bpf_init_inode_xattr_impl)
 BTF_KFUNCS_END(bpf_fs_kfunc_set_ids)
 
+BTF_ID_LIST(bpf_lsm_inode_init_security_btf_ids)
+BTF_ID(func, bpf_lsm_inode_init_security)
+
+BTF_ID_LIST(bpf_init_inode_xattr_btf_ids)
+BTF_ID(func, bpf_init_inode_xattr)
+BTF_ID(func, bpf_init_inode_xattr_impl)
+
 static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
 {
 	if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) ||
-	    prog->type == BPF_PROG_TYPE_LSM)
+	    prog->type == BPF_PROG_TYPE_LSM) {
+		/* bpf_init_inode_xattr[_impl] only attach to inode_init_security. */
+		if ((kfunc_id == bpf_init_inode_xattr_btf_ids[0] ||
+		     kfunc_id == bpf_init_inode_xattr_btf_ids[1]) &&
+		    prog->aux->attach_btf_id != bpf_lsm_inode_init_security_btf_ids[0])
+			return -EACCES;
 		return 0;
+	}
 	return -EACCES;
 }
 
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 101ca6cc5424..e73bb2222c3d 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -682,6 +682,7 @@ struct bpf_insn_aux_data {
 	 */
 	u8 fastcall_spills_num:3;
 	u8 arg_prog:4;
+	u8 init_inode_xattr_fixup:1;
 
 	/* below fields are initialized once */
 	unsigned int orig_idx; /* original instruction index */
@@ -903,6 +904,8 @@ struct bpf_verifier_env {
 	bool bypass_spec_v4;
 	bool seen_direct_write;
 	bool seen_exception;
+	bool needs_ctx_spill;
+	s16 ctx_stack_off;
 	struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
 	const struct bpf_line_info *prev_linfo;
 	struct bpf_verifier_log log;
diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
index fba9e8c00878..18d612a9fe29 100644
--- a/kernel/bpf/fixups.c
+++ b/kernel/bpf/fixups.c
@@ -725,6 +725,26 @@ int bpf_convert_ctx_accesses(struct bpf_verifier_env *env)
 		}
 	}
 
+	if (env->needs_ctx_spill) {
+		if (epilogue_cnt) {
+			/* gen_epilogue already saved ctx to the stack */
+			env->ctx_stack_off = -(s16)subprogs[0].stack_depth;
+		} else {
+			cnt = 0;
+			subprogs[0].stack_depth += 8;
+			env->ctx_stack_off = -(s16)subprogs[0].stack_depth;
+			insn_buf[cnt++] = BPF_STX_MEM(BPF_DW, BPF_REG_FP,
+						      BPF_REG_1,
+						      env->ctx_stack_off);
+			insn_buf[cnt++] = env->prog->insnsi[0];
+			new_prog = bpf_patch_insn_data(env, 0, insn_buf, cnt);
+			if (!new_prog)
+				return -ENOMEM;
+			env->prog = new_prog;
+			delta += cnt - 1;
+		}
+	}
+
 	if (ops->gen_prologue || env->seen_direct_write) {
 		if (!ops->gen_prologue) {
 			verifier_bug(env, "gen_prologue is null");
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 03f9e16c2abe..af5753ffb16b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10794,6 +10794,8 @@ enum special_kfunc_type {
 	KF_bpf_arena_alloc_pages,
 	KF_bpf_arena_free_pages,
 	KF_bpf_arena_reserve_pages,
+	KF_bpf_init_inode_xattr,
+	KF_bpf_init_inode_xattr_impl,
 	KF_bpf_session_is_return,
 	KF_bpf_stream_vprintk,
 	KF_bpf_stream_print_stack,
@@ -10882,6 +10884,13 @@ BTF_ID(func, bpf_task_work_schedule_resume)
 BTF_ID(func, bpf_arena_alloc_pages)
 BTF_ID(func, bpf_arena_free_pages)
 BTF_ID(func, bpf_arena_reserve_pages)
+#ifdef CONFIG_BPF_LSM
+BTF_ID(func, bpf_init_inode_xattr)
+BTF_ID(func, bpf_init_inode_xattr_impl)
+#else
+BTF_ID_UNUSED
+BTF_ID_UNUSED
+#endif
 BTF_ID(func, bpf_session_is_return)
 BTF_ID(func, bpf_stream_vprintk)
 BTF_ID(func, bpf_stream_print_stack)
@@ -12701,6 +12710,24 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 	if (err < 0)
 		return err;
 
+	if (meta.func_id == special_kfunc_list[KF_bpf_init_inode_xattr_impl]) {
+		verbose(env, "bpf_init_inode_xattr_impl is not callable directly\n");
+		return -EACCES;
+	}
+
+	if (meta.func_id == special_kfunc_list[KF_bpf_init_inode_xattr]) {
+		if (env->cur_state->curframe != 0) {
+			verbose(env, "bpf_init_inode_xattr cannot be called from subprograms\n");
+			return -EINVAL;
+		}
+		env->needs_ctx_spill = true;
+		insn_aux->init_inode_xattr_fixup = true;
+		err = bpf_add_kfunc_call(env,
+					 special_kfunc_list[KF_bpf_init_inode_xattr_impl], 0);
+		if (err < 0)
+			return err;
+	}
+
 	if (is_bpf_rbtree_add_kfunc(meta.func_id)) {
 		err = push_callback_call(env, insn, insn_idx, meta.subprogno,
 					 set_rbtree_add_callback_state);
@@ -19272,6 +19299,33 @@ int bpf_fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		insn_buf[4] = BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1);
 		insn_buf[5] = BPF_ALU64_IMM(BPF_NEG, BPF_REG_0, 0);
 		*cnt = 6;
+	} else if (env->insn_aux_data[insn_idx].init_inode_xattr_fixup) {
+		struct bpf_kfunc_desc *impl_desc;
+
+		impl_desc = find_kfunc_desc(env->prog,
+					    special_kfunc_list[KF_bpf_init_inode_xattr_impl], 0);
+		if (!impl_desc) {
+			verifier_bug(env, "bpf_init_inode_xattr_impl desc not found");
+			return -EFAULT;
+		}
+
+		/* Rewrite bpf_init_inode_xattr(name, value) to inject xattrs and
+		 * xattr_count loaded from the saved inode_init_security ctx.
+		 */
+		insn_buf[0] = BPF_MOV64_REG(BPF_REG_3, BPF_REG_1);
+		insn_buf[1] = BPF_MOV64_REG(BPF_REG_4, BPF_REG_2);
+		insn_buf[2] = BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_FP,
+					  env->ctx_stack_off);
+		insn_buf[3] = BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_2,
+					  3 * sizeof(u64));
+		insn_buf[4] = BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_2,
+					  4 * sizeof(u64));
+		insn_buf[5] = *insn;
+		if (!bpf_jit_supports_far_kfunc_call())
+			insn_buf[5].imm = BPF_CALL_IMM(impl_desc->addr);
+		else
+			insn_buf[5].imm = impl_desc->func_id;
+		*cnt = 6;
 	}
 
 	if (env->insn_aux_data[insn_idx].arg_prog) {
diff --git a/security/bpf/hooks.c b/security/bpf/hooks.c
index 40efde233f3a..1e61baa821bd 100644
--- a/security/bpf/hooks.c
+++ b/security/bpf/hooks.c
@@ -28,8 +28,11 @@ static int __init bpf_lsm_init(void)
 	return 0;
 }
 
+#define BPF_LSM_INODE_INIT_XATTRS 1
+
 struct lsm_blob_sizes bpf_lsm_blob_sizes __ro_after_init = {
 	.lbs_inode = sizeof(struct bpf_storage_blob),
+	.lbs_xattr_count = BPF_LSM_INODE_INIT_XATTRS,
 };
 
 DEFINE_LSM(bpf) = {
-- 
2.53.0


^ permalink raw reply related

* [PATCH] selinux: don't reserve xattr slot when we won't fill it
From: David Windsor @ 2026-04-26 23:23 UTC (permalink / raw)
  To: Paul Moore, Stephen Smalley
  Cc: Ondrej Mosnacek, selinux, linux-security-module, linux-kernel

Move lsm_get_xattr_slot() below the SBLABEL_MNT check so we don't leave
a NULL-named slot in the array when returning -EOPNOTSUPP; filesystem
initxattrs() callbacks stop iterating at the first NULL ->name, silently
dropping xattrs installed by later LSMs.

Signed-off-by: David Windsor <dwindsor@gmail.com>
---
 security/selinux/hooks.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 97801966bf32..4ff118a9395f 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -2966,7 +2966,7 @@ static int selinux_inode_init_security(struct inode *inode, struct inode *dir,
 {
 	const struct cred_security_struct *crsec = selinux_cred(current_cred());
 	struct superblock_security_struct *sbsec;
-	struct xattr *xattr = lsm_get_xattr_slot(xattrs, xattr_count);
+	struct xattr *xattr;
 	u32 newsid, clen;
 	u16 newsclass;
 	int rc;
@@ -2992,6 +2992,7 @@ static int selinux_inode_init_security(struct inode *inode, struct inode *dir,
 	    !(sbsec->flags & SBLABEL_MNT))
 		return -EOPNOTSUPP;
 
+	xattr = lsm_get_xattr_slot(xattrs, xattr_count);
 	if (xattr) {
 		rc = security_sid_to_context_force(newsid,
 						   &context, &clen);

base-commit: 254f49634ee16a731174d2ae34bc50bd5f45e731
-- 
2.53.0


^ permalink raw reply related

* Re: [PATCH v2 0/4] Firmware LSM hook
From: Jason Gunthorpe @ 2026-04-26 13:42 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Paul Moore, Roberto Sassu, KP Singh, Matt Bobrowski,
	Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
	Saeed Mahameed, Itay Avraham, Dave Jiang, Jonathan Cameron, bpf,
	linux-kernel, linux-kselftest, linux-rdma, Chiara Meiohas,
	Maher Sanalla, linux-security-module
In-Reply-To: <20260426103957.GH172828@unreal>

On Sun, Apr 26, 2026 at 01:39:57PM +0300, Leon Romanovsky wrote:
> On Fri, Apr 24, 2026 at 11:19:21AM -0300, Jason Gunthorpe wrote:
> > On Thu, Apr 23, 2026 at 05:09:50PM +0300, Leon Romanovsky wrote:
> > 
> > > > > Leon mentioned that different firmware revisions would have different
> > > > > parameters for a given opcode, and that one would need to inspect
> > > > > those parameters to properly filter the command.  Is that not true, or
> > > > > am I misreading or misunderstanding Leon's comments?
> > > > 
> > > > They are ABI stable, so there will be rules about future changes that
> > > > old software can follow to ignore or reject future things it doesn't
> > > > understand.
> > > 
> > > It is wishful thinking and applicable only to mlx5 devices. No one
> > > promises that other devices follow same ABI rules.
> > 
> > Well, I will definately kick them out of fwctl if they don't.
> 
> It is easy to say but harder to follow. The kernel includes many devices that
> exist only in specific hyperscale environments, where the update cycle is
> tightly controlled. They easily can break FW backward compatibility.

Well Linus's rule applies here, if it doesn't bother anyone it didn't
break..

Jason

^ permalink raw reply

* Re: [PATCH v2 0/4] Firmware LSM hook
From: Leon Romanovsky @ 2026-04-26 10:39 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Paul Moore, Roberto Sassu, KP Singh, Matt Bobrowski,
	Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
	Saeed Mahameed, Itay Avraham, Dave Jiang, Jonathan Cameron, bpf,
	linux-kernel, linux-kselftest, linux-rdma, Chiara Meiohas,
	Maher Sanalla, linux-security-module
In-Reply-To: <20260424141921.GA3611611@ziepe.ca>

On Fri, Apr 24, 2026 at 11:19:21AM -0300, Jason Gunthorpe wrote:
> On Thu, Apr 23, 2026 at 05:09:50PM +0300, Leon Romanovsky wrote:
> 
> > > > Leon mentioned that different firmware revisions would have different
> > > > parameters for a given opcode, and that one would need to inspect
> > > > those parameters to properly filter the command.  Is that not true, or
> > > > am I misreading or misunderstanding Leon's comments?
> > > 
> > > They are ABI stable, so there will be rules about future changes that
> > > old software can follow to ignore or reject future things it doesn't
> > > understand.
> > 
> > It is wishful thinking and applicable only to mlx5 devices. No one
> > promises that other devices follow same ABI rules.
> 
> Well, I will definately kick them out of fwctl if they don't.

It is easy to say but harder to follow. The kernel includes many devices that
exist only in specific hyperscale environments, where the update cycle is
tightly controlled. They easily can break FW backward compatibility.

Thanks

> 
> Jason

^ permalink raw reply

* Re: [PATCH RFC 3/3] LSM: Reserve use of secmarks
From: Casey Schaufler @ 2026-04-25 19:03 UTC (permalink / raw)
  To: Paul Moore, linux-security-module
  Cc: jmorris, serge, keescook, john.johansen, penguin-kernel,
	stephen.smalley.work, selinux, Casey Schaufler
In-Reply-To: <8fa09781ac340398fb2b914bf29d9dcb@paul-moore.com>

On 4/23/2026 6:19 PM, Paul Moore wrote:
> On Feb 25, 2026 Casey Schaufler <casey@schaufler-ca.com> wrote:
>> Use of "exclusive" LSM hooks are limited to the first LSM registering
>> them. These hooks include those use to process network secmarks.
>> The hooks required to process secmarks are flagged with LSM_FLAG_EXCLUSIVE
>> in their definitions. This includes secid_to_secctx and secctx_to_secid,
>> which are used by netfilter.
>>
>> The various LSMs that use secmarks are updated to detect whether
>> they are providing exclusive hooks, and to eschew the use of secmarks
>> if they are not.
>>
>> SELinux has a peculiar behavior in that it may decide that it
>> must have network controls, but only after policy is loaded.
>> This patch includes a warning should policy capability alwaysnetwork
>> be set when another LSM holds the exclusive hooks. It has been
>> suggested that SELinux would consider this a fatal condition, in
>> which case a panic might be a favored, if draconian, option.
>>
>> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
>> ---
>>  include/linux/lsm_hook_defs.h    | 12 +++++------
>>  include/linux/security.h         |  1 +
>>  security/apparmor/lsm.c          | 24 ++++++++++++++++------
>>  security/security.c              | 15 ++++++++++++++
>>  security/selinux/hooks.c         | 35 ++++++++++++++++++++++++--------
>>  security/selinux/ss/services.c   |  3 +++
>>  security/smack/smack_lsm.c       |  6 ++++--
>>  security/smack/smack_netfilter.c |  6 ++++++
>>  8 files changed, 80 insertions(+), 22 deletions(-)
> ..
>
>> diff --git a/include/linux/security.h b/include/linux/security.h
>> index e3c137a1b30a..638436b9b748 100644
>> --- a/include/linux/security.h
>> +++ b/include/linux/security.h
>> @@ -326,6 +326,7 @@ int unregister_blocking_lsm_notifier(struct notifier_block *nb);
>>  extern int security_init(void);
>>  extern int early_security_init(void);
>>  extern u64 lsm_name_to_attr(const char *name);
>> +extern u32 lsm_secmark_from_skb(struct sk_buff *skb, const u64 lsmid);
>>  
>>  /* Security operations */
>>  int security_binder_set_context_mgr(const struct cred *mgr);
>> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
>> index a87cd60ed206..2ce0d9a73973 100644
>> --- a/security/apparmor/lsm.c
>> +++ b/security/apparmor/lsm.c
>> @@ -1511,9 +1511,11 @@ static int apparmor_socket_shutdown(struct socket *sock, int how)
>>  static int apparmor_socket_sock_rcv_skb(struct sock *sk, struct sk_buff *skb)
>>  {
>>  	struct aa_sk_ctx *ctx = aa_sock(sk);
>> +	u32 secmark;
>>  	int error;
>>  
>> -	if (!skb->secmark)
>> +	secmark = lsm_secmark_from_skb(skb, LSM_ID_APPARMOR);
>> +	if (!secmark)
>>  		return 0;
>>  
>>  	/*
> ..
>
>> diff --git a/security/security.c b/security/security.c
>> index 25e7cfc96f20..754b16004677 100644
>> --- a/security/security.c
>> +++ b/security/security.c
>> @@ -4509,6 +4509,21 @@ void security_inet_conn_established(struct sock *sk,
>>  }
>>  EXPORT_SYMBOL(security_inet_conn_established);
>>  
>> +/**
>> + * lsm_secmark_from_skb - secid for the specified LSM from the packet
>> + * @skb: the packet
>> + * @lsm: which LSM is asking
>> + *
>> + * If the specified LSM has use of the secmark, return its value.
>> + * Otherwise, return the invalid secid 0.
>> + */
>> +u32 lsm_secmark_from_skb(struct sk_buff *skb, const u64 lsmid)
>> +{
>> +	if (lsmid == lsm_exclusive_hooks)
>> +		return skb->secmark;
>> +	return 0;
>> +}
> Ooof.  Not a fan.  A better way to handle this would be to like I
> mentioned in my comments on patch 2/3: have the LSM detect that it lost
> the LSM_FLAG_EXCLUSIVE battle at callback registration time and do
> whatever adjustments are necessary at init time.  In a number of cases I
> believe that simply not registering the callback will be sufficient.

In no case (AppArmor, Smack, SELinux) will that be sufficient. All
access skb->secmark directly. The hooks that use the secmark also use
other mechanisms (e.g. netlabel) to obtain security attributes.

> If the only way to solve this is via runtime checks like this,
> unfortunately my answer is going to be "no".

OK. I hear you.

> .. and of course the proper way to solve this for secmark is to just
> do the idr/xarray conversion for secmarks so every LSM can have their
> own secmark.  Yes, it does require some work, but to be honest I think
> you've spent more time avoiding that work then it would be to just do
> it.  I'm growing increasing inclined to simply state that this is the
> only solution I'm going to accept.

Thank you for the clarity. If you happen to have a preferred example of
how idr/xarray is used I expect that we can reduce the review cycle of
such an implementation.

... Or, if there's someone out there who's itching to get involved and
eager to tackle a moderately complicated integration effort, I'd happily
outline the pain points.

>
> --
> paul-moore.com

With that I think this RFC has been useful in that the LSM define flag
scheme might be useful in the future, but doesn't address the task at
hand.


^ permalink raw reply

* Re: [RFC PATCH v3 4/4] Revert "firmware: arm_ffa: Change initcall level of ffa_init() to rootfs_initcall"
From: Jarkko Sakkinen @ 2026-04-25 14:19 UTC (permalink / raw)
  To: Jonathan McDowell
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, jgg, sudeep.holla,
	maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will, noodles, sebastianene, Yeoreum Yun
In-Reply-To: <2e7b4dc552b45ddf14cc43bc449cbebb4ade0027.1777036497.git.noodles@meta.com>

On Fri, Apr 24, 2026 at 02:24:42PM +0100, Jonathan McDowell wrote:
> From: Yeoreum Yun <yeoreum.yun@arm.com>
> 
> This reverts commit 0e0546eabcd6c19765a8dbf5b5db3723e7b0ea75, which was
> added to address ordering issues with the IMA LSM initialisation where
> the TPM would not be fully ready by the time IMA wanted it. This has
> been resolved within IMA by retrying setup during late_initcall_sync if
> the TPM is not available at first.
> 
> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> Signed-off-by: Jonathan McDowell <noodles@meta.com>
> ---
>  drivers/firmware/arm_ffa/driver.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/firmware/arm_ffa/driver.c b/drivers/firmware/arm_ffa/driver.c
> index f2f94d4d533e..01547c5c0e38 100644
> --- a/drivers/firmware/arm_ffa/driver.c
> +++ b/drivers/firmware/arm_ffa/driver.c
> @@ -2106,7 +2106,7 @@ static int __init ffa_init(void)
>  	kfree(drv_info);
>  	return ret;
>  }
> -rootfs_initcall(ffa_init);
> +module_init(ffa_init);
>  
>  static void __exit ffa_exit(void)
>  {
> -- 
> 2.53.0
> 

LGTM (for both tpm patches).

However, I'll hold on any further comments/tags up until I've sorted 7.1 PRs
(just so that I have full focus).

BR, Jarkko

^ permalink raw reply

* Re: [RFC PATCH v3 2/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Jonathan McDowell @ 2026-04-25  9:10 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
	dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
	oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
	will, noodles, sebastianene, Yeoreum Yun
In-Reply-To: <43ff6ca37df45ed53061dad46e9d31a5118e5714.camel@linux.ibm.com>

On Fri, Apr 24, 2026 at 04:25:31PM -0400, Mimi Zohar wrote:
>Thanks, Jonathan!
>
>On Fri, 2026-04-24 at 14:24 +0100, Jonathan McDowell wrote:
>> -static int __init init_ima(void)
>> +static int __init init_ima(bool late)
>>  {
>>  	int error;
>>  
>> @@ -1247,10 +1247,26 @@ static int __init init_ima(void)
>>  		return 0;
>>  	}
>>  
>> +	/*
>> +	 * If we found the TPM during our first attempt, or we know there's no
>> +	 * TPM, nothing further to do
>> +	 */
>
>Perhaps it's just me, but the comment wording is a bit off.  Could I change it
>to: If we either found the TPM or knew there's no TPM during our first attempt,
>nothing futher to do.

No objections to that updated wording from me.

>Otherwise the patch looks good.
>
>Mimi
>
>
>> +	if (late && (ima_tpm_chip || !IS_ENABLED(CONFIG_TCG_TPM)))
>> +		return 0;
>> +
>> +	ima_tpm_chip = tpm_default_chip();
>> +	if (!ima_tpm_chip && !late && IS_ENABLED(CONFIG_TCG_TPM)) {
>> +		pr_debug("TPM not available, will try later\n");
>> +		return -EPROBE_DEFER;
>> +	}
>> +
>> +	if (!ima_tpm_chip)
>> +		pr_info("No TPM chip found, activating TPM-bypass!\n");
>> +

J.

-- 
Revd Jonathan McDowell, ULC | Run like hell!

^ permalink raw reply

* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Yeoreum Yun @ 2026-04-25  4:53 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Paul Moore, roberto.sassu, Jonathan McDowell,
	linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, jmorris, serge, dmitry.kasatkin,
	eric.snowberg, jarkko, jgg, sudeep.holla, maz, oupton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, noodles,
	sebastianene
In-Reply-To: <014cf39aa8d6a0bcfa1a95c069675977ac67b843.camel@linux.ibm.com>

[...]
> > I understand the need to ensure that the TPM is available, but if it
> > isn't safe to wait to initialize IMA at late_initcall_sync() then it
> > would seem like this is a bad option and we need another mechanism to
> > synchronize IMA with TPM devices.  If it is safe to initalize IMA in
> > late_initcall_sync(), just do that and be done with it.
>
> Within the same initcall level there is no way of ordering the initialization.
> Yeorum attempted to address the ordering issue in commit 0e0546eabcd6
> ("firmware: arm_ffa: Change initcall level of ffa_init() to rootfs_initcall"),
> which is being reverted in this patch set.
>
> Ordering within an initcall level needs to be fixed, but for now retrying at
> late_initcall_sync works for some, hopefully most, cases.

Ordering within an initcall level is not good idea.

Though it came to my mind first long ago,
It also goes against existing mechanisms like deferred probe, and
can cause the driver model’s dependency handling to harden
in the wrong direction. So this I think it wouldn't be an option.

--
Sincerely,
Yeoreum Yun

^ permalink raw reply

* Re: [PATCH RFC 2/3] LSM: Enforce exclusive hooks
From: Casey Schaufler @ 2026-04-25  0:39 UTC (permalink / raw)
  To: Paul Moore, linux-security-module
  Cc: jmorris, serge, keescook, john.johansen, penguin-kernel,
	stephen.smalley.work, selinux, Casey Schaufler
In-Reply-To: <c101a47f7d262bded2bce52bf5463586@paul-moore.com>

On 4/23/2026 6:19 PM, Paul Moore wrote:
> On Feb 25, 2026 Casey Schaufler <casey@schaufler-ca.com> wrote:
>> If an LSM hook is marked as exclusive via LSM_FLAG_EXCLUSIVE
>> in lsm_hook_defs.h it will not be added to the set of hooks to
>> be executed if an different LSM has already registered an
>> exclusive hook.
>>
>> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
>> ---
>>  include/linux/security.h |  2 ++
>>  security/lsm_init.c      | 66 ++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 68 insertions(+)
>>
>> diff --git a/include/linux/security.h b/include/linux/security.h
>> index 83a646d72f6f..e3c137a1b30a 100644
>> --- a/include/linux/security.h
>> +++ b/include/linux/security.h
>> @@ -2404,4 +2404,6 @@ static inline void security_initramfs_populated(void)
>>  }
>>  #endif /* CONFIG_SECURITY */
>>  
>> +extern u64 lsm_exclusive_hooks;
> We already have the 'lsm_exclusive' variable in lsm_init.c, don't create
> another variable that does the same thing.  If the scope of the existing
> variable isn't what you need, change that.

Reusing lsm_exclusive is probably doable. For the RFC it made more
sense to make them separate.

> Although to be honest, I'm not in love with what you're doing with the
> variable anyway, more on that later in the review.
>
>>  #endif /* ! __LINUX_SECURITY_H */
>> diff --git a/security/lsm_init.c b/security/lsm_init.c
>> index 05bd52e6b1f2..dc3c84387a7e 100644
>> --- a/security/lsm_init.c
>> +++ b/security/lsm_init.c
>> @@ -356,6 +356,70 @@ static int __init lsm_static_call_init(struct security_hook_list *hl)
>>  	return -ENOSPC;
>>  }
>>  
>> +/*
>> + * Hooks that are restricted to use by a single security module.
>> + *
>> + * Secmark hooks have not been converted from secids to lsm_props
>> + * due to space limitations in packet headers.
> If this is a general purpose mechanism for all types of LSM hooks, please
> don't put commentary in here about a single class of hook.

OKey Dokey. 

> If this only reason for doing all of this is for secmark, just fix
> secmark instead of going through all of this trouble.

Secmark is the bellwether for using the mechanism.

>> + * Conversions from a secid to a secctx are restricted to the
>> + * single security module. All cases where there may be multiple
>> + * modules providing the input data have been converted to use
>> + * a lsm_prop instead of a secid.
> Okay, yes, the paragraph above is true, I'm just not sure why it is
> important here?

It's important because it addresses the objection that the secmark
issue might be addressed by converting the secmark to a lsm_prop pointer.
I can certainly make that clearer.

>> + */
>> +struct lsm_exclusive {
>> +	struct lsm_static_call *name;
>> +	char *namestr;
>> +	u32 flags;
>> +};
>> +
>> +static __initdata struct lsm_exclusive lsm_exclusive_set[] = {
>> +#define LSM_HOOK(RET, DEFAULT, FLAGS, NAME, ...)	\
>> +	{ .name = static_calls_table.NAME, .flags = FLAGS, .namestr = "" #NAME "" , },
>> +#include <linux/lsm_hook_defs.h>
>> +#undef LSM_HOOK
>> +};
>> +u64 lsm_exclusive_hooks;
>> +EXPORT_SYMBOL(lsm_exclusive_hooks);
> Unless I missed something, we really shouldn't need to export this, why
> did you need EXPORT_SYMBOL() here?

No, you didn't miss anything. It's a carryover from an early experiment.

>> +/**
>> + * lsm_exclusive_hook_denial - Check if exclusive hook is in use
>> + * @hook: the hook to check
>> + *
>> + * Check if the hook in question is restricted to a single using LSM,
>> + * and if the LSM providing single LSM hooks is defined.
>> + *
>> + * Returns true if the hook is exclusive and they are already provided,
>> + * false otherwise.
>> + */
>> +static bool __init lsm_exclusive_hook_denial(struct security_hook_list *hook)
>> +{
>> +	int i;
>> +
>> +	if (lsm_exclusive_hooks == hook->lsmid->id)
>> +		return false;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(lsm_exclusive_set); i++) {
>> +		if (!(lsm_exclusive_set[i].flags & LSM_FLAG_EXCLUSIVE))
>> +			continue;
> The logic on this looks a bit odd.  What is wrong with something like the
> pseduo code below?

This code isn't just looking to see if the hook is exclusive, it's looking
to see if the hook is exclusive and if the LSM requesting it has been given
the privilege to register exclusive hooks. This may not be the cleanest way
to go about it, so I will reconsider how best to do that.

>     for (i = 0; ARRAY_SIZE(lsm_hooks); i++) {
>       if (lsm_hooks[i] == hook) {
>         if (lsm_hooks[i].flags & LSM_FLAG_EXCLUSIVE)
>           return true;
>         else
>           return false;
>       }
>     }
>
>> +		if (hook->scalls == lsm_exclusive_set[i].name) {
>> +			if (lsm_exclusive_hooks) {
>> +				if (lsm_debug)
>> +					lsm_pr("%s denied for %s.\n",
>> +						lsm_exclusive_set[i].namestr,
>> +						hook->lsmid->name);
> The lsm_pr_dbg() macro exists for this very reason.

Indeed. I missed it here. Will correct.

>> +				return true;
>> +			}
>> +			if (lsm_debug)
>> +				lsm_pr("Exclusive hooks limited to %s.\n",
>> +					hook->lsmid->name);
> Same as above.

Same as above.

>> +			lsm_exclusive_hooks = hook->lsmid->id;
>> +			break;
>> +		}
>> +	}
>> +	return false;
>> +}
>>
>>  /**
>>   * security_add_hooks - Add a LSM's hooks to the LSM framework's hook lists
>>   * @hooks: LSM hooks to add
>> @@ -371,6 +435,8 @@ void __init security_add_hooks(struct security_hook_list *hooks, int count,
>>  
>>  	for (i = 0; i < count; i++) {
>>  		hooks[i].lsmid = lsmid;
>> +		if (lsm_exclusive_hook_denial(&hooks[i]))
>> +			continue;
>>  		if (lsm_static_call_init(&hooks[i]))
>>  			panic("exhausted LSM callback slots with LSM %s\n",
>>  			      lsmid->name);
> I don't think we'd want to simply skip over a hook registration if the
> LSM doesn't have access to an exclusive hook.  At the very least it risks
> unexpected behavior in the LSM and at the worst it prevents the LSM from
> properly enforcing it's security policy.  There is a reason we have the
> panic() call in the existing code if we are not able to register a hook.
>
> The simpliest solution here would be to panic, like we do today.
>
> However, if you want to get fancy and enable LSMs to optionally adjust
> their behavior to cope with the loss of a hook callback (I'm looking at
> your patch 3/3 now), come up with a mechanism to report back to the LSM
> that one or more of the callbacks were not registered.  One quick thought
> would be to return a non-zero error code and add an indicator/bool/flag
> to the security_hook_list that would be set by security_add_hooks().

An LSM can easily check to see if it doesn't get exclusive hooks by looking
at lsm_exclusive_hooks after registration.

	if (lsm_exclusive_hooks != LSM_ID_ME)
		... do something (in)appropriate


^ permalink raw reply

* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Mimi Zohar @ 2026-04-24 22:49 UTC (permalink / raw)
  To: Paul Moore
  Cc: Yeoreum Yun, roberto.sassu, Jonathan McDowell,
	linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, jmorris, serge, dmitry.kasatkin,
	eric.snowberg, jarkko, jgg, sudeep.holla, maz, oupton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, noodles,
	sebastianene
In-Reply-To: <CAHC9VhS4JmPmCJrYTdbjxb3TO-uK6cB3Zij-ef=wswGce2BGzg@mail.gmail.com>

On Fri, 2026-04-24 at 18:10 -0400, Paul Moore wrote:
> (I'm assuming you meant initcall and not syscall above, but if you're
> talking about something else, please let me know.)
> 
> Saying that you aren't comfortable moving IMA initialization to
> late-sync is inconsistent with allowing IMA initialization to be
> deferred to late-sync.  Either it is okay to initialize IMA in
> late-sync or it isn't.  You must pick one.

Yes, we're discussing late_initcall and late_initcall_sync.

I prefer to look at it as being pragmatic. I'd rather err on the side of caution
and not move the syscall to late_initcall_sync, than move it.  However, others
have moved the syscall to address the TPM-bypass issue for their environment.

Mimi

^ permalink raw reply

* Re: [PATCH v2 0/4] Firmware LSM hook
From: Jason Gunthorpe @ 2026-04-24 22:13 UTC (permalink / raw)
  To: Paul Moore
  Cc: Leon Romanovsky, Roberto Sassu, KP Singh, Matt Bobrowski,
	Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
	Saeed Mahameed, Itay Avraham, Dave Jiang, Jonathan Cameron, bpf,
	linux-kernel, linux-kselftest, linux-rdma, Chiara Meiohas,
	Maher Sanalla, linux-security-module
In-Reply-To: <CAHC9VhR++21SD+v4Bb16SQmYHgJYZ0ytQ+BecGPNK+fEOe4G7g@mail.gmail.com>

On Fri, Apr 24, 2026 at 04:59:30PM -0400, Paul Moore wrote:
> >
> > > Most LSMs will want to know who is initiating the firmware request
> > > (the subject), the requested operation/opcode (the action/verb), and
> > > the target of the request (the object, which in this case is likely
> > > the kernel or the device).
> >
> > How is
> >   system_u:object_r:httpd_packet_t:s0
> >
> > A kernel or device?
> 
> It's not.  It's one of two labels on a packet.  I've cautioned you
> about leaning too heavily on the secmark comparison as it falls apart
> in a number of places, this is one of those places.

But I want to label a packet too, you keep going back to it not being
the same thing and I keep repeating that all I want to do is put
labels on FWCTL packets :(

> > It is a label for packet contents. I also want a label for packet contents.
> 
> According to your explanations, my understanding is that you want a
> fwctl RPC operation.  That is not the same as the secmark label
> assigned by an iptables/nftables rule.

I view fwctl as an opaque packet based messaging subsystem. It
communicates a packet to a remote CPU and returns a response packet
back to the userspace.

Trying to have the kernel assign fixed meaning to the content of the
packets inside the kernel is contrary to the entire design of fwctl.

It is like demanding the netstack parse HTTP packets as a precondition
to using LSM. It makes no sense.

Any LSM integration requires a labeling system that is not hard wired
into the built kernel. I don't much care what it is, so long as the
classification and label space are defined by userspace.

You say it is not like secmark, fine, but I see a perfect mirror in
secmark...

> > You can take that view, it is certainly one valid way to look at it.
> >
> > But it is completely impractical.
> 
> Elaborate on that, because from what I can tell that is the valid way
> to look at it from a subject/verb/object perspective.

We cannot have the kernel predefine verb labels.

I'm completely fine with using verb if it can be dynamic and userspace
can tell the kernel what the verbs labels are.

This is the only reason I pointed at secmark, it shows a system that
has both a user controller classifier and dynamic labels that are not
fixed into the built kernel. ie it is flexible.

> > > > The same way secmark cannot pre-identify all the XXX_packet_t's.
> > >
> > > Once again, I think there is a disconnect or misunderstanding, on a
> > > SELinux system using secmark all of the packet types, e.g.
> > > "XXX_packet_t's", *are* pre-defined in the SELinux policy.
> >
> > "Pre-defined" in a text files in user space controlled by the admin.
>
> That's not correct.  It's kinda like saying the NIC driver sources are
> simply "text files in user space controlled by the admin".  

That's very pedantic. I mean to the point I wonder if we are even
speaking the same language.

I said the labels are defined by userspace, you said no, and then
explained that they are defined by userspace going through a bunch of
steps:

> The SELinux secmark labels are defined in the SELinux policy sources
> which must be compiled and loaded into the kernel before they are
> valid on a running system. Policy must be written not only to define
> the secmark labels, but also to define the access control rules
> which govern how those packets are handled by the system.  The
> iptables/nftables command lines simply assign a secmark label to a
> packet; that's important, but only a small part of the total
> equation.

I understand all of this, I am totally fine with it. A package will
install, a distribution will provide, or admin will write these
things, and do all the steps to load them into the kernel. I don't see
any issue with that.

Hardwiring things into the built kernel is a problem that must be
avoided because end users only run the kernel provided by the
distribution. "recompiling the driver" is not an option that is
available.

Jason

^ permalink raw reply

* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Paul Moore @ 2026-04-24 22:10 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Yeoreum Yun, roberto.sassu, Jonathan McDowell,
	linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, jmorris, serge, dmitry.kasatkin,
	eric.snowberg, jarkko, jgg, sudeep.holla, maz, oupton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, noodles,
	sebastianene
In-Reply-To: <1f78fc75b2e95239973612a4b6c4cc782960b7ac.camel@linux.ibm.com>

On Fri, Apr 24, 2026 at 6:01 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> On Fri, 2026-04-24 at 17:08 -0400, Paul Moore wrote:
> > On Fri, Apr 24, 2026 at 4:58 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > On Fri, 2026-04-24 at 16:15 -0400, Paul Moore wrote:
> > > > On Fri, Apr 24, 2026 at 1:57 AM Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > > > > > On Thu, Apr 23, 2026 at 2:13 PM Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > > > > > >
> > > > > > > Sounds good. Once the patch is posted, I’ll review it as well.
> > > > > > > Sorry again for the noise, and thanks for your patience ;)
> > > > > >
> > > > > > My apologies for not getting a chance to look at this patchset sooner.
> > > > > >
> > > > > > This seems like an obvious, perhaps even stupid, question, but I have
> > > > > > to ask: if IMA can be properly initialized via late_initcall_sync(),
> > > > > > why not simply do the initialization in late_initcall_sync() and drop
> > > > > > the late_initcall() initialization?
> > > > > >
> > > > > > Does any IMA functionality suffer if initialization waits until
> > > > > > late_initcall_sync()?  If so, it seems non-critical if waiting until
> > > > > > _sync() is acceptable, as it appears in these patches/comments.
> > > > >
> > > > > This is the way first patch did, and here is some discussion for this
> > > > > (Might you have seen, but in case of you missed):
> > > > >   - https://lore.kernel.org/all/a6a0e15286c983d720de227c6827adbe976c5b9b.camel@linux.ibm.com/
> > > >
> > > > Thanks for the pointer.
> > > >
> > > > Unfortunately, my concern remains the same: it's either "safe" to
> > > > initialize IMA at late_initcall_sync() or it isn't.  Attempting to
> > > > initialize IMA twice seems both odd and wrong.
> > >
> > > Agreed.  However, IMA should be initialized as soon as the TPM becomes
> > > available, not delayed.
> > >
> > > In patch 2/4 patch description, Jonathan describes the reasoning:
> > >
> > > Unfortunately some TPM drivers (such as Arm FF-A, or SPI attached TPM
> > > devices) are not reliably available during the initcall_late stage,
> > > resulting in a log error:
> > >
> > >   ima: No TPM chip found, activating TPM-bypass!
> > >
> > > and no measurements into the TPM by IMA. We can avoid this by doing IMA
> > > init in the initcall_late_sync stage, after the drivers have completed
> > > their init + registration.
> > >
> > > Rather than do this everywhere, and needlessly delay the initialisation
> > > of IMA when there is no need to do so, we continue to try to initialise
> > > at the earlier stage, only deferring to the later point if the TPM is
> > > not available yet.
> >
> > Once again, that heavily implies that it is safe to initialize IMA in late-sync.
> >
> > Put another way, what breaks if IMA's initialization is delayed to
> > late-sync?  If the answer is nothing, just move the initialization to
> > late-sync.  However, if something *is* broken and we are just doing
> > this because of TPM delays at boot, this patchset just creates
> > additional problems and we need a different solution.  I can't
> > envision a scenario where it makes sense to attempt initialization
> > twice.
> >
> > > > I understand the need to ensure that the TPM is available, but if it
> > > > isn't safe to wait to initialize IMA at late_initcall_sync() then it
> > > > would seem like this is a bad option and we need another mechanism to
> > > > synchronize IMA with TPM devices.  If it is safe to initalize IMA in
> > > > late_initcall_sync(), just do that and be done with it.
> > >
> > > Within the same initcall level there is no way of ordering the initialization.
> > > Yeorum attempted to address the ordering issue in commit 0e0546eabcd6
> > > ("firmware: arm_ffa: Change initcall level of ffa_init() to rootfs_initcall"),
> > > which is being reverted in this patch set.
> > >
> > > Ordering within an initcall level needs to be fixed, but for now retrying at
> > > late_initcall_sync works for some, hopefully most, cases.
> >
> > That's not a good answer.  Ignoring the TPM issue for a moment, can
> > you confirm that moving IMA's initialization to late-sync is safe?  If
> > not, why is this approach being considered?
>
> I'm not seeing any measurements between late_initcall and late_initcall_sync.
> Based on this very limited testing, I don't feel comfortable to actually move
> the syscall, but can see adding support to allow IMA initialization to be
> deferred.

(I'm assuming you meant initcall and not syscall above, but if you're
talking about something else, please let me know.)

Saying that you aren't comfortable moving IMA initialization to
late-sync is inconsistent with allowing IMA initialization to be
deferred to late-sync.  Either it is okay to initialize IMA in
late-sync or it isn't.  You must pick one.

-- 
paul-moore.com

^ permalink raw reply

* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Mimi Zohar @ 2026-04-24 22:00 UTC (permalink / raw)
  To: Paul Moore
  Cc: Yeoreum Yun, roberto.sassu, Jonathan McDowell,
	linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, jmorris, serge, dmitry.kasatkin,
	eric.snowberg, jarkko, jgg, sudeep.holla, maz, oupton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, noodles,
	sebastianene
In-Reply-To: <CAHC9VhTW3=RJ8L91RWXYYA9tFjfSXGN-DMEEwdiD6big9H57Ew@mail.gmail.com>

On Fri, 2026-04-24 at 17:08 -0400, Paul Moore wrote:
> On Fri, Apr 24, 2026 at 4:58 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > On Fri, 2026-04-24 at 16:15 -0400, Paul Moore wrote:
> > > On Fri, Apr 24, 2026 at 1:57 AM Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > > > > On Thu, Apr 23, 2026 at 2:13 PM Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > > > > > 
> > > > > > Sounds good. Once the patch is posted, I’ll review it as well.
> > > > > > Sorry again for the noise, and thanks for your patience ;)
> > > > > 
> > > > > My apologies for not getting a chance to look at this patchset sooner.
> > > > > 
> > > > > This seems like an obvious, perhaps even stupid, question, but I have
> > > > > to ask: if IMA can be properly initialized via late_initcall_sync(),
> > > > > why not simply do the initialization in late_initcall_sync() and drop
> > > > > the late_initcall() initialization?
> > > > > 
> > > > > Does any IMA functionality suffer if initialization waits until
> > > > > late_initcall_sync()?  If so, it seems non-critical if waiting until
> > > > > _sync() is acceptable, as it appears in these patches/comments.
> > > > 
> > > > This is the way first patch did, and here is some discussion for this
> > > > (Might you have seen, but in case of you missed):
> > > >   - https://lore.kernel.org/all/a6a0e15286c983d720de227c6827adbe976c5b9b.camel@linux.ibm.com/
> > > 
> > > Thanks for the pointer.
> > > 
> > > Unfortunately, my concern remains the same: it's either "safe" to
> > > initialize IMA at late_initcall_sync() or it isn't.  Attempting to
> > > initialize IMA twice seems both odd and wrong.
> > 
> > Agreed.  However, IMA should be initialized as soon as the TPM becomes
> > available, not delayed.
> > 
> > In patch 2/4 patch description, Jonathan describes the reasoning:
> > 
> > Unfortunately some TPM drivers (such as Arm FF-A, or SPI attached TPM
> > devices) are not reliably available during the initcall_late stage,
> > resulting in a log error:
> > 
> >   ima: No TPM chip found, activating TPM-bypass!
> > 
> > and no measurements into the TPM by IMA. We can avoid this by doing IMA
> > init in the initcall_late_sync stage, after the drivers have completed
> > their init + registration.
> > 
> > Rather than do this everywhere, and needlessly delay the initialisation
> > of IMA when there is no need to do so, we continue to try to initialise
> > at the earlier stage, only deferring to the later point if the TPM is
> > not available yet.
> 
> Once again, that heavily implies that it is safe to initialize IMA in late-sync.
> 
> Put another way, what breaks if IMA's initialization is delayed to
> late-sync?  If the answer is nothing, just move the initialization to
> late-sync.  However, if something *is* broken and we are just doing
> this because of TPM delays at boot, this patchset just creates
> additional problems and we need a different solution.  I can't
> envision a scenario where it makes sense to attempt initialization
> twice.
> 
> > > I understand the need to ensure that the TPM is available, but if it
> > > isn't safe to wait to initialize IMA at late_initcall_sync() then it
> > > would seem like this is a bad option and we need another mechanism to
> > > synchronize IMA with TPM devices.  If it is safe to initalize IMA in
> > > late_initcall_sync(), just do that and be done with it.
> > 
> > Within the same initcall level there is no way of ordering the initialization.
> > Yeorum attempted to address the ordering issue in commit 0e0546eabcd6
> > ("firmware: arm_ffa: Change initcall level of ffa_init() to rootfs_initcall"),
> > which is being reverted in this patch set.
> > 
> > Ordering within an initcall level needs to be fixed, but for now retrying at
> > late_initcall_sync works for some, hopefully most, cases.
> 
> That's not a good answer.  Ignoring the TPM issue for a moment, can
> you confirm that moving IMA's initialization to late-sync is safe?  If
> not, why is this approach being considered?

I'm not seeing any measurements between late_initcall and late_initcall_sync. 
Based on this very limited testing, I don't feel comfortable to actually move
the syscall, but can see adding support to allow IMA initialization to be
deferred.

Mimi

^ permalink raw reply

* Re: [PATCH] ima: Fix sigv3 signature handling for EVM_IMA_XATTR_DIGSIG
From: Stefan Berger @ 2026-04-24 21:24 UTC (permalink / raw)
  To: Kamlesh Kumar, zohar
  Cc: linux-integrity, linux-security-module, linux-kernel,
	Kamlesh Kumar
In-Reply-To: <20260424113946.16561-1-kam@juniper.net>



On 4/24/26 7:39 AM, Kamlesh Kumar wrote:
> ima_get_hash_algo() only recognizes version 2 signatures when the xattr
> type is EVM_IMA_XATTR_DIGSIG. Since sigv3 signatures also use
> EVM_IMA_XATTR_DIGSIG as the xattr type, version 3 must be accepted as
> well to correctly determine the hash algorithm.

Thanks. I tested this with your patch. I can sign now with evmctl 
ima_sign --v3 -a sha512 ... even if sha256 is the IMA default and IMA 
verifies it now. Before I had to use evmctl ima_sign --v3 -a sha256 ...

> 
> Additionally, ima_validate_rule() does not include IMA_SIGV3_REQUIRED in
> the allowed flags bitmask for MODULE_CHECK, KEXEC_KERNEL_CHECK, and
> KEXEC_INITRAMFS_CHECK hook functions. As a result, policy rules with
> "appraise_type=sigv3" are rejected for these functions.

# echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=sigv3" > 
/sys/kernel/security/ima/policy
-bash: echo: write error: Invalid argument

This rule is now accepted with your patch.

> 
> Add version 3 to the accepted versions in ima_get_hash_algo() for
> EVM_IMA_XATTR_DIGSIG, and add IMA_SIGV3_REQUIRED to the allowed flags
> for MODULE_CHECK, KEXEC_KERNEL_CHECK, and KEXEC_INITRAMFS_CHECK in
> ima_validate_rule().
> 
> Signed-off-by: Kamlesh Kumar <kam@juniper.net>

Tested-by: Stefan Berger <stefanb@linux.ibm.com>

> ---
>   security/integrity/ima/ima_appraise.c | 5 +++--
>   security/integrity/ima/ima_policy.c   | 3 ++-
>   2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c
> index de963b9f3634..2dd231567710 100644
> --- a/security/integrity/ima/ima_appraise.c
> +++ b/security/integrity/ima/ima_appraise.c
> @@ -195,8 +195,9 @@ enum hash_algo ima_get_hash_algo(const struct evm_ima_xattr_data *xattr_value,
>   		return sig->hash_algo;
>   	case EVM_IMA_XATTR_DIGSIG:
>   		sig = (typeof(sig))xattr_value;
> -		if (sig->version != 2 || xattr_len <= sizeof(*sig)
> -		    || sig->hash_algo >= HASH_ALGO__LAST)
> +		if ((sig->version != 2 && sig->version != 3) ||
> +		    xattr_len <= sizeof(*sig) ||
> +		    sig->hash_algo >= HASH_ALGO__LAST)
>   			return ima_hash_algo;
>   		return sig->hash_algo;
>   	case IMA_XATTR_DIGEST_NG:
> diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
> index f7f940a76922..b1c010e8eb13 100644
> --- a/security/integrity/ima/ima_policy.c
> +++ b/security/integrity/ima/ima_policy.c
> @@ -1313,7 +1313,8 @@ static bool ima_validate_rule(struct ima_rule_entry *entry)
>   				     IMA_GID | IMA_EGID |
>   				     IMA_FGROUP | IMA_DIGSIG_REQUIRED |
>   				     IMA_PERMIT_DIRECTIO | IMA_MODSIG_ALLOWED |
> -				     IMA_CHECK_BLACKLIST | IMA_VALIDATE_ALGOS))
> +				     IMA_CHECK_BLACKLIST | IMA_VALIDATE_ALGOS |
> +				     IMA_SIGV3_REQUIRED))
>   			return false;
>   
>   		break;
> 
> base-commit: 82bbd447199ff1441031d2eaf9afe041550cf525


^ permalink raw reply

* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Paul Moore @ 2026-04-24 21:08 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Yeoreum Yun, roberto.sassu, Jonathan McDowell,
	linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, jmorris, serge, dmitry.kasatkin,
	eric.snowberg, jarkko, jgg, sudeep.holla, maz, oupton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, noodles,
	sebastianene
In-Reply-To: <014cf39aa8d6a0bcfa1a95c069675977ac67b843.camel@linux.ibm.com>

On Fri, Apr 24, 2026 at 4:58 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> On Fri, 2026-04-24 at 16:15 -0400, Paul Moore wrote:
> > On Fri, Apr 24, 2026 at 1:57 AM Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > > > On Thu, Apr 23, 2026 at 2:13 PM Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > > > >
> > > > > Sounds good. Once the patch is posted, I’ll review it as well.
> > > > > Sorry again for the noise, and thanks for your patience ;)
> > > >
> > > > My apologies for not getting a chance to look at this patchset sooner.
> > > >
> > > > This seems like an obvious, perhaps even stupid, question, but I have
> > > > to ask: if IMA can be properly initialized via late_initcall_sync(),
> > > > why not simply do the initialization in late_initcall_sync() and drop
> > > > the late_initcall() initialization?
> > > >
> > > > Does any IMA functionality suffer if initialization waits until
> > > > late_initcall_sync()?  If so, it seems non-critical if waiting until
> > > > _sync() is acceptable, as it appears in these patches/comments.
> > >
> > > This is the way first patch did, and here is some discussion for this
> > > (Might you have seen, but in case of you missed):
> > >   - https://lore.kernel.org/all/a6a0e15286c983d720de227c6827adbe976c5b9b.camel@linux.ibm.com/
> >
> > Thanks for the pointer.
> >
> > Unfortunately, my concern remains the same: it's either "safe" to
> > initialize IMA at late_initcall_sync() or it isn't.  Attempting to
> > initialize IMA twice seems both odd and wrong.
>
> Agreed.  However, IMA should be initialized as soon as the TPM becomes
> available, not delayed.
>
> In patch 2/4 patch description, Jonathan describes the reasoning:
>
> Unfortunately some TPM drivers (such as Arm FF-A, or SPI attached TPM
> devices) are not reliably available during the initcall_late stage,
> resulting in a log error:
>
>   ima: No TPM chip found, activating TPM-bypass!
>
> and no measurements into the TPM by IMA. We can avoid this by doing IMA
> init in the initcall_late_sync stage, after the drivers have completed
> their init + registration.
>
> Rather than do this everywhere, and needlessly delay the initialisation
> of IMA when there is no need to do so, we continue to try to initialise
> at the earlier stage, only deferring to the later point if the TPM is
> not available yet.

Once again, that heavily implies that it is safe to initialize IMA in late-sync.

Put another way, what breaks if IMA's initialization is delayed to
late-sync?  If the answer is nothing, just move the initialization to
late-sync.  However, if something *is* broken and we are just doing
this because of TPM delays at boot, this patchset just creates
additional problems and we need a different solution.  I can't
envision a scenario where it makes sense to attempt initialization
twice.

> > I understand the need to ensure that the TPM is available, but if it
> > isn't safe to wait to initialize IMA at late_initcall_sync() then it
> > would seem like this is a bad option and we need another mechanism to
> > synchronize IMA with TPM devices.  If it is safe to initalize IMA in
> > late_initcall_sync(), just do that and be done with it.
>
> Within the same initcall level there is no way of ordering the initialization.
> Yeorum attempted to address the ordering issue in commit 0e0546eabcd6
> ("firmware: arm_ffa: Change initcall level of ffa_init() to rootfs_initcall"),
> which is being reverted in this patch set.
>
> Ordering within an initcall level needs to be fixed, but for now retrying at
> late_initcall_sync works for some, hopefully most, cases.

That's not a good answer.  Ignoring the TPM issue for a moment, can
you confirm that moving IMA's initialization to late-sync is safe?  If
not, why is this approach being considered?

-- 
paul-moore.com

^ permalink raw reply

* Re: [PATCH v2 0/4] Firmware LSM hook
From: Paul Moore @ 2026-04-24 20:59 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Roberto Sassu, KP Singh, Matt Bobrowski,
	Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
	Saeed Mahameed, Itay Avraham, Dave Jiang, Jonathan Cameron, bpf,
	linux-kernel, linux-kselftest, linux-rdma, Chiara Meiohas,
	Maher Sanalla, linux-security-module
In-Reply-To: <20260424143603.GB3611611@ziepe.ca>

On Fri, Apr 24, 2026 at 10:36 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> On Mon, Apr 20, 2026 at 08:58:09PM -0400, Paul Moore wrote:
> > > > > > The access control point itself represents the requested
> > > > > > operation.  This is possible because the number of networking
> > > > > > operations on a given packet is well defined and fairly limited; at a
> > > > > > high level the packet is either being sent from the node, received by
> > > > > > the node, or is passing through the node.
> > > > >
> > > > > I think we have the same split, fwctl send/recive analog is also very
> > > > > limited.
> > > >
> > > > Sure, but I thought the goal was to enforce access controls on the
> > > > firmware requests based on the opcodes/parameters contained within the
> > > > firmware request blob/mailbox?
> > >
> > > Yes, that's the goal. It is the same as iptables being able to
> > > identify that a send system call has a packet that is http or dns.
> >
> > I think we still have a disconnect here.  A packet being a DNS or HTTP
> > packet is different from an opcode.  The opcode in the iptables isn't
> > "DNS" or "HTTP" it is "INPUT", "OUTPUT", or "FORWARD".
>
> I understand that
>
> > Most LSMs will want to know who is initiating the firmware request
> > (the subject), the requested operation/opcode (the action/verb), and
> > the target of the request (the object, which in this case is likely
> > the kernel or the device).
>
> How is
>   system_u:object_r:httpd_packet_t:s0
>
> A kernel or device?

It's not.  It's one of two labels on a packet.  I've cautioned you
about leaning too heavily on the secmark comparison as it falls apart
in a number of places, this is one of those places.

> It is a label for packet contents. I also want a label for packet contents.

According to your explanations, my understanding is that you want a
fwctl RPC operation.  That is not the same as the secmark label
assigned by an iptables/nftables rule.

> > As I understand things, the action/verb is going to be the opcode
> > within the firmware request.  If you believe I'm wrong about this
> > please help me understand why.
>
> You could make that choice, I'm arguing we should not, and it should
> be in the object side.

Okay, you believe I'm wrong, that's fine, but you need to provide a
(better) explanation for why I'm wrong and your approach is The Right
Way.  Present your case, but please do it without referencing secmark
as that comparison is horribly broken at this point in the discussion.

> > > - op_X_t is the result of the classifier inspecting the RPC
> > >   packet. Admin tells the classifier to return op_X_t similar to
> > >   how --selctx does for iptables.
> >
> > I've tried to explain how this doesn't match with secmark, but I'm
> > evidently doing a poor job.
>
> Yeah, I don't get it at all, sorry. I fell you are making some very
> nuanced distinction with HTTP being an object but the HTTP-equivilant
> in fwctl is not an object, I can't follow it at all.
>
> By that logic:
>
>    iptables -p 80 --string "GET"
>
> Is an action, and it should get a unique action in the tuple.

Let's both do ourselves a favor and drop the secmark comparisons; I
think it is only hurting things at this point.  If we stick with the
secmark analogy I worry we are going to keep repeating the same things
to each other without making any forward progress.

> > If you want to continue with the secmark comparisons it might be
> > helpful to spend some time configuring secmark on a SELinux system,
> > and writing policy for it, to see how it works.
>
> I think I have a pretty good idea, you haven't said anything that
> contradicts what I expect..

Frankly, several comments, including in your last reply, indicate you
don't really grasp secmark, subject/verb/object, SELinux, or some
combination thereof ... and that's okay, you don't really need to
understand those details.  Let's move past the failed secmark analogy
and return to the fwctl hooks, that's the ultimate goal.

> > certain action on an object.  My concern with your example is that the
> > object isn't what is actually being acted upon, it's the requested
> > action.
>
> Object is a label for the packet contents.
>
> > The fwctl ioctl/API allows a user to act on a device, with the
> > actual action being specified by the fwctl payload.  From what I can
> > see, the classifier's output is the action, not the object.
>
> You can take that view, it is certainly one valid way to look at it.
>
> But it is completely impractical.

Elaborate on that, because from what I can tell that is the valid way
to look at it from a subject/verb/object perspective.

> > > The same way secmark cannot pre-identify all the XXX_packet_t's.
> >
> > Once again, I think there is a disconnect or misunderstanding, on a
> > SELinux system using secmark all of the packet types, e.g.
> > "XXX_packet_t's", *are* pre-defined in the SELinux policy.
>
> "Pre-defined" in a text files in user space controlled by the admin.

That's not correct.  It's kinda like saying the NIC driver sources are
simply "text files in user space controlled by the admin".  The
SELinux secmark labels are defined in the SELinux policy sources which
must be compiled and loaded into the kernel before they are valid on a
running system.  Policy must be written not only to define the secmark
labels, but also to define the access control rules which govern how
those packets are handled by the system.  The iptables/nftables
command lines simply assign a secmark label to a packet; that's
important, but only a small part of the total equation.

-- 
paul-moore.com

^ permalink raw reply

* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Mimi Zohar @ 2026-04-24 20:57 UTC (permalink / raw)
  To: Paul Moore, Yeoreum Yun, roberto.sassu
  Cc: Jonathan McDowell, linux-security-module, linux-kernel,
	linux-integrity, linux-arm-kernel, kvmarm, jmorris, serge,
	dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
	oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
	will, noodles, sebastianene
In-Reply-To: <CAHC9VhSaT_quKYnpFjAfqvL07JNbWMgM6c4pB9F46NHawX3DCA@mail.gmail.com>

On Fri, 2026-04-24 at 16:15 -0400, Paul Moore wrote:
> On Fri, Apr 24, 2026 at 1:57 AM Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > > On Thu, Apr 23, 2026 at 2:13 PM Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > > > 
> > > > Sounds good. Once the patch is posted, I’ll review it as well.
> > > > Sorry again for the noise, and thanks for your patience ;)
> > > 
> > > My apologies for not getting a chance to look at this patchset sooner.
> > > 
> > > This seems like an obvious, perhaps even stupid, question, but I have
> > > to ask: if IMA can be properly initialized via late_initcall_sync(),
> > > why not simply do the initialization in late_initcall_sync() and drop
> > > the late_initcall() initialization?
> > > 
> > > Does any IMA functionality suffer if initialization waits until
> > > late_initcall_sync()?  If so, it seems non-critical if waiting until
> > > _sync() is acceptable, as it appears in these patches/comments.
> > 
> > This is the way first patch did, and here is some discussion for this
> > (Might you have seen, but in case of you missed):
> >   - https://lore.kernel.org/all/a6a0e15286c983d720de227c6827adbe976c5b9b.camel@linux.ibm.com/
> 
> Thanks for the pointer.
> 
> Unfortunately, my concern remains the same: it's either "safe" to
> initialize IMA at late_initcall_sync() or it isn't.  Attempting to
> initialize IMA twice seems both odd and wrong.

Agreed.  However, IMA should be initialized as soon as the TPM becomes
available, not delayed.

In patch 2/4 patch description, Jonathan describes the reasoning:

Unfortunately some TPM drivers (such as Arm FF-A, or SPI attached TPM
devices) are not reliably available during the initcall_late stage,
resulting in a log error:

  ima: No TPM chip found, activating TPM-bypass!

and no measurements into the TPM by IMA. We can avoid this by doing IMA
init in the initcall_late_sync stage, after the drivers have completed
their init + registration.

Rather than do this everywhere, and needlessly delay the initialisation
of IMA when there is no need to do so, we continue to try to initialise
at the earlier stage, only deferring to the later point if the TPM is
not available yet.

> 
> I understand the need to ensure that the TPM is available, but if it
> isn't safe to wait to initialize IMA at late_initcall_sync() then it
> would seem like this is a bad option and we need another mechanism to
> synchronize IMA with TPM devices.  If it is safe to initalize IMA in
> late_initcall_sync(), just do that and be done with it.

Within the same initcall level there is no way of ordering the initialization.
Yeorum attempted to address the ordering issue in commit 0e0546eabcd6
("firmware: arm_ffa: Change initcall level of ffa_init() to rootfs_initcall"),
which is being reverted in this patch set.

Ordering within an initcall level needs to be fixed, but for now retrying at
late_initcall_sync works for some, hopefully most, cases.

> 
> I'm also guessing a two stage init process, e.g. some in
> late_initcall() and some in late_initcall_sync(), doesn't make much
> sense here, but that could be one other thing to consider if some IMA
> tasks must be done in late_initcall().

Perhaps something to think about for the future.

Mimi

^ permalink raw reply

* Re: [PATCH RFC 1/3] LSM: add a flags field to the LSM hook definitions
From: Paul Moore @ 2026-04-24 20:29 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: linux-security-module, jmorris, serge, keescook, john.johansen,
	penguin-kernel, stephen.smalley.work, selinux
In-Reply-To: <fadac326-d6cc-4d28-8ebe-b4ed3a06ddd0@schaufler-ca.com>

On Fri, Apr 24, 2026 at 11:24 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 4/23/2026 6:19 PM, Paul Moore wrote:
> > On Feb 25, 2026 Casey Schaufler <casey@schaufler-ca.com> wrote:
> >> Add a field for flags to the definition of LSM hooks.  This allows
> >> for hooks to be identified at system initialization for special
> >> processing.
> >>
> >> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
> >> ---
> >>  include/linux/bpf_lsm.h       |   2 +-
> >>  include/linux/lsm_hook_defs.h | 614 ++++++++++++++++++----------------
> >>  include/linux/lsm_hooks.h     |   4 +-
> >>  kernel/bpf/bpf_lsm.c          |  10 +-
> >>  security/bpf/hooks.c          |   2 +-
> >>  security/security.c           |   6 +-
> >>  6 files changed, 331 insertions(+), 307 deletions(-)
> >>
> >> diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
> >> index 643809cc78c3..d71ba8c87e79 100644
> >> --- a/include/linux/bpf_lsm.h
> >> +++ b/include/linux/bpf_lsm.h
> >> @@ -14,7 +14,7 @@
> >>
> >>  #ifdef CONFIG_BPF_LSM
> >>
> >> -#define LSM_HOOK(RET, DEFAULT, NAME, ...) \
> >> +#define LSM_HOOK(RET, DEFAULT, FLAGS, NAME, ...) \
> >>      RET bpf_lsm_##NAME(__VA_ARGS__);
> >>  #include <linux/lsm_hook_defs.h>
> >>  #undef LSM_HOOK
> >> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> >> index 8c42b4bde09c..acda3a02da97 100644
> >> --- a/include/linux/lsm_hook_defs.h
> >> +++ b/include/linux/lsm_hook_defs.h
> >> @@ -18,451 +18,475 @@
> >>   * The macro LSM_HOOK is used to define the data structures required by
> >>   * the LSM framework using the pattern:
> >>   *
> >> - *  LSM_HOOK(<return_type>, <default_value>, <hook_name>, args...)
> >> + *  LSM_HOOK(<return_type>, <default_value>, <flags>, <single>,
> >> + *           <hook_name>, args...)
> >>   *
> >>   * struct security_hook_heads {
> >> - *   #define LSM_HOOK(RET, DEFAULT, NAME, ...) struct hlist_head NAME;
> >> + *   #define LSM_HOOK(RET, DEFAULT, FLAGS, NAME, ...) struct hlist_head NAME;
> >>   *   #include <linux/lsm_hook_defs.h>
> >>   *   #undef LSM_HOOK
> >>   * };
> >>   */
> >> -LSM_HOOK(int, 0, binder_set_context_mgr, const struct cred *mgr)
> >> -LSM_HOOK(int, 0, binder_transaction, const struct cred *from,
> >> +LSM_HOOK(int, 0, 0, binder_set_context_mgr, const struct cred *mgr)
> >> +LSM_HOOK(int, 0, 0, binder_transaction, const struct cred *from,
> >>       const struct cred *to)
> > I think adding a flag field to the LSM_HOOK() macro/definitions is a good
> > and useful addition, but I'd prefer if we created a LSM_FLAG_NONE #define
> > and used it here just so we could avoid the back-to-back 0's and do a bit
> > of self-documentation.
>
> I had LSM_FLAG_NONE initially, but removed it when I saw the amount of code
> churn it introduced. I'm happy to put it back.

You're already touching every LSM_HOOK() definition to add a another
parameter, changing that parameter from 0 to LSM_FLAG_NONE shouldn't
make it that much worse.

-- 
paul-moore.com

^ permalink raw reply

* Re: [RFC PATCH v3 2/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Mimi Zohar @ 2026-04-24 20:25 UTC (permalink / raw)
  To: Jonathan McDowell, linux-security-module, linux-kernel,
	linux-integrity, linux-arm-kernel, kvmarm
  Cc: paul, jmorris, serge, roberto.sassu, dmitry.kasatkin,
	eric.snowberg, jarkko, jgg, sudeep.holla, maz, oupton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, noodles,
	sebastianene, Yeoreum Yun
In-Reply-To: <5552c20c6d6d2ae3bbb6b35124af5d98d2f79163.1777036497.git.noodles@meta.com>

Thanks, Jonathan!

On Fri, 2026-04-24 at 14:24 +0100, Jonathan McDowell wrote:
> -static int __init init_ima(void)
> +static int __init init_ima(bool late)
>  {
>  	int error;
>  
> @@ -1247,10 +1247,26 @@ static int __init init_ima(void)
>  		return 0;
>  	}
>  
> +	/*
> +	 * If we found the TPM during our first attempt, or we know there's no
> +	 * TPM, nothing further to do
> +	 */

Perhaps it's just me, but the comment wording is a bit off.  Could I change it
to: If we either found the TPM or knew there's no TPM during our first attempt,
nothing futher to do.

Otherwise the patch looks good.

Mimi


> +	if (late && (ima_tpm_chip || !IS_ENABLED(CONFIG_TCG_TPM)))
> +		return 0;
> +
> +	ima_tpm_chip = tpm_default_chip();
> +	if (!ima_tpm_chip && !late && IS_ENABLED(CONFIG_TCG_TPM)) {
> +		pr_debug("TPM not available, will try later\n");
> +		return -EPROBE_DEFER;
> +	}
> +
> +	if (!ima_tpm_chip)
> +		pr_info("No TPM chip found, activating TPM-bypass!\n");
> +

^ permalink raw reply

* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Paul Moore @ 2026-04-24 20:15 UTC (permalink / raw)
  To: Yeoreum Yun, Mimi Zohar, roberto.sassu
  Cc: Jonathan McDowell, linux-security-module, linux-kernel,
	linux-integrity, linux-arm-kernel, kvmarm, jmorris, serge,
	dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
	oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
	will, noodles, sebastianene
In-Reply-To: <aesGU8a3mbVzvteH@e129823.arm.com>

On Fri, Apr 24, 2026 at 1:57 AM Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > On Thu, Apr 23, 2026 at 2:13 PM Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > >
> > > Sounds good. Once the patch is posted, I’ll review it as well.
> > > Sorry again for the noise, and thanks for your patience ;)
> >
> > My apologies for not getting a chance to look at this patchset sooner.
> >
> > This seems like an obvious, perhaps even stupid, question, but I have
> > to ask: if IMA can be properly initialized via late_initcall_sync(),
> > why not simply do the initialization in late_initcall_sync() and drop
> > the late_initcall() initialization?
> >
> > Does any IMA functionality suffer if initialization waits until
> > late_initcall_sync()?  If so, it seems non-critical if waiting until
> > _sync() is acceptable, as it appears in these patches/comments.
>
> This is the way first patch did, and here is some discussion for this
> (Might you have seen, but in case of you missed):
>   - https://lore.kernel.org/all/a6a0e15286c983d720de227c6827adbe976c5b9b.camel@linux.ibm.com/

Thanks for the pointer.

Unfortunately, my concern remains the same: it's either "safe" to
initialize IMA at late_initcall_sync() or it isn't.  Attempting to
initialize IMA twice seems both odd and wrong.

I understand the need to ensure that the TPM is available, but if it
isn't safe to wait to initialize IMA at late_initcall_sync() then it
would seem like this is a bad option and we need another mechanism to
synchronize IMA with TPM devices.  If it is safe to initalize IMA in
late_initcall_sync(), just do that and be done with it.

I'm also guessing a two stage init process, e.g. some in
late_initcall() and some in late_initcall_sync(), doesn't make much
sense here, but that could be one other thing to consider if some IMA
tasks must be done in late_initcall().

-- 
paul-moore.com

^ permalink raw reply

* Re: [RFC PATCH v1 01/11] security: add LSM blob and hooks for namespaces
From: Paul Moore @ 2026-04-24 19:28 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Christian Brauner, Günther Noack, Serge E . Hallyn,
	Justin Suess, Lennart Poettering, Mikhail Ivanov,
	Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang, kernel-team,
	linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260424.ein5Aiwah6Ai@digikod.net>

On Fri, Apr 24, 2026 at 2:56 PM Mickaël Salaün <mic@digikod.net> wrote:
> On Wed, Apr 22, 2026 at 08:19:59PM -0400, Paul Moore wrote:
> > On Thu, Mar 12, 2026 at 6:05 AM Mickaël Salaün <mic@digikod.net> wrote:
> > >
> > > From: Christian Brauner <brauner@kernel.org>
> > >
> > > All namespace types now share the same ns_common infrastructure. Extend
> > > this to include a security blob so LSMs can start managing namespaces
> > > uniformly without having to add one-off hooks or security fields to
> > > every individual namespace type.
> > >
> > > Add a ns_security pointer to ns_common and the corresponding lbs_ns
> > > blob size to lsm_blob_sizes. Allocation and freeing hooks are called
> > > from the common __ns_common_init() and __ns_common_free() paths so
> > > every namespace type gets covered in one go. All information about the
> > > namespace type and the appropriate casting helpers to get at the
> > > containing namespace are available via ns_common making it
> > > straightforward for LSMs to differentiate when they need to.
> > >
> > > A namespace_install hook is called from validate_ns() during setns(2)
> > > giving LSMs a chance to enforce policy on namespace transitions.
> > >
> > > Individual namespace types can still have their own specialized security
> > > hooks when needed. This is just the common baseline that makes it easy
> > > to track and manage namespaces from the security side without requiring
> > > every namespace type to reinvent the wheel.
> > >
> > > Cc: Günther Noack <gnoack@google.com>
> > > Cc: Paul Moore <paul@paul-moore.com>
> > > Cc: Serge E. Hallyn <serge@hallyn.com>
> > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > Link: https://lore.kernel.org/r/20260216-work-security-namespace-v1-1-075c28758e1f@kernel.org
> > > ---
> > >  include/linux/lsm_hook_defs.h      |  3 ++
> > >  include/linux/lsm_hooks.h          |  1 +
> > >  include/linux/ns/ns_common_types.h |  3 ++
> > >  include/linux/security.h           | 20 ++++++++
> > >  kernel/nscommon.c                  | 12 +++++
> > >  kernel/nsproxy.c                   |  8 +++-
> > >  security/lsm_init.c                |  2 +
> > >  security/security.c                | 76 ++++++++++++++++++++++++++++++
> > >  8 files changed, 124 insertions(+), 1 deletion(-)

...

> > > diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> > > index 259c4b4f1eeb..f0b30d1907e7 100644
> > > --- a/kernel/nsproxy.c
> > > +++ b/kernel/nsproxy.c
> > > @@ -379,7 +379,13 @@ static int prepare_nsset(unsigned flags, struct nsset *nsset)
> > >
> > >  static inline int validate_ns(struct nsset *nsset, struct ns_common *ns)
> > >  {
> > > -       return ns->ops->install(nsset, ns);
> > > +       int ret;
> > > +
> > > +       ret = ns->ops->install(nsset, ns);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       return security_namespace_install(nsset, ns);
> > >  }
> >
> > Do we also want a security_namespace_switch() called from within
> > switch_task_namespaces()?  Of course LSMs would not be able to fail or
> > return an error at that point, but it seems reasonable that LSMs might
> > want to update LSM state associated with the current task once the
> > namespaces have been changed.  This is similar to all the "_post_" LSM
> > hooks we have for various operations in the VFS and network layers.
>
> What cannot be infered from security_namespace_install()?

We don't actually know if the namespace is attached to a process until
we get to switch_task_namespaces().

Now that I'm looking at this again, why is the
security_namespace_install() call placed after the ns->ops->install()
call?  From an access control perspective we want the LSM hook before
the namespace install, not after as in this patch.  From an LSM state
update perspective we likely only care about when the namespace/nsset
is assigned to a process.

> > I think we would want to pass both the task_struct and whichever
> > nsproxy instance is not stored in the task_struct to the hook.  I
> > prefer placing the hook after the task_struct has been updated, but if
> > anyone feels strongly that it should be the other way that's okay with
> > me.
> >
> > > diff --git a/security/security.c b/security/security.c
> > > index 67af9228c4e9..dcf073cac848 100644
> > > --- a/security/security.c
> > > +++ b/security/security.c
> > > +/**
> > > + * security_namespace_free() - Release LSM security data from a namespace
> > > + * @ns: the namespace being freed
> > > + *
> > > + * Release security data attached to the namespace. Called before the
> > > + * namespace structure is freed.
> > > + *
> > > + * Note: The namespace may be freed via kfree_rcu(). LSMs must use
> > > + * RCU-safe freeing for any data that might be accessed by concurrent
> > > + * RCU readers.
> > > + */
> > > +void security_namespace_free(struct ns_common *ns)
> > > +{
> > > +       if (!ns->ns_security)
> > > +               return;
> > > +
> > > +       call_void_hook(namespace_free, ns);
> > > +
> > > +       kfree(ns->ns_security);
> > > +       ns->ns_security = NULL;
> > > +}
> >
> > The "namespace may be freed via kfree_rcu()" comment in conjunction
> > with the standard kfree() in the function above raises a red flag.  Do
> > we need to take an approach similar to
> > security_inode_free()/inode_free_by_rcu() here?
>
> I though at that too, but the key point is that once the namespace is
> being freed it is not reachable, which is not the case for inode being
> freed (i.e. because of path walks).  So I think the current approach is
> fine even if the comment might be a bit scary.

The comment in the code and your comment in this email thread don't
seem to align very well; one of those needs to be fixed in the next
revision ;)

-- 
paul-moore.com

^ permalink raw reply

* Re: [RFC PATCH v1 01/11] security: add LSM blob and hooks for namespaces
From: Mickaël Salaün @ 2026-04-24 18:56 UTC (permalink / raw)
  To: Paul Moore
  Cc: Christian Brauner, Günther Noack, Serge E . Hallyn,
	Justin Suess, Lennart Poettering, Mikhail Ivanov,
	Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang, kernel-team,
	linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <CAHC9VhT4Cd-FMYhaKgt6Gv-Suu7YJm7YbKZ-vdMwQV7EiWDryA@mail.gmail.com>

On Wed, Apr 22, 2026 at 08:19:59PM -0400, Paul Moore wrote:
> On Thu, Mar 12, 2026 at 6:05 AM Mickaël Salaün <mic@digikod.net> wrote:
> >
> > From: Christian Brauner <brauner@kernel.org>
> >
> > All namespace types now share the same ns_common infrastructure. Extend
> > this to include a security blob so LSMs can start managing namespaces
> > uniformly without having to add one-off hooks or security fields to
> > every individual namespace type.
> >
> > Add a ns_security pointer to ns_common and the corresponding lbs_ns
> > blob size to lsm_blob_sizes. Allocation and freeing hooks are called
> > from the common __ns_common_init() and __ns_common_free() paths so
> > every namespace type gets covered in one go. All information about the
> > namespace type and the appropriate casting helpers to get at the
> > containing namespace are available via ns_common making it
> > straightforward for LSMs to differentiate when they need to.
> >
> > A namespace_install hook is called from validate_ns() during setns(2)
> > giving LSMs a chance to enforce policy on namespace transitions.
> >
> > Individual namespace types can still have their own specialized security
> > hooks when needed. This is just the common baseline that makes it easy
> > to track and manage namespaces from the security side without requiring
> > every namespace type to reinvent the wheel.
> >
> > Cc: Günther Noack <gnoack@google.com>
> > Cc: Paul Moore <paul@paul-moore.com>
> > Cc: Serge E. Hallyn <serge@hallyn.com>
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > Link: https://lore.kernel.org/r/20260216-work-security-namespace-v1-1-075c28758e1f@kernel.org
> > ---
> >  include/linux/lsm_hook_defs.h      |  3 ++
> >  include/linux/lsm_hooks.h          |  1 +
> >  include/linux/ns/ns_common_types.h |  3 ++
> >  include/linux/security.h           | 20 ++++++++
> >  kernel/nscommon.c                  | 12 +++++
> >  kernel/nsproxy.c                   |  8 +++-
> >  security/lsm_init.c                |  2 +
> >  security/security.c                | 76 ++++++++++++++++++++++++++++++
> >  8 files changed, 124 insertions(+), 1 deletion(-)
> 
> ...
> 
> > diff --git a/kernel/nscommon.c b/kernel/nscommon.c
> > index bdc3c86231d3..de774e374f9d 100644
> > --- a/kernel/nscommon.c
> > +++ b/kernel/nscommon.c
> > @@ -77,6 +81,13 @@ int __ns_common_init(struct ns_common *ns, u32 ns_type, const struct proc_ns_ope
> >                 ret = proc_alloc_inum(&ns->inum);
> >         if (ret)
> >                 return ret;
> > +
> > +       ret = security_namespace_alloc(ns);
> > +       if (ret) {
> > +               proc_free_inum(ns->inum);
> > +               return ret;
> > +       }
> 
> Since this is an RFC, I'll make the nitpicky comment that it would be
> better if the LSM hook is called security_namespace_init() instead of
> security_namespace_alloc().  This fits better with the convention of
> aligning with the caller's name, as well as to helps to indicate that
> the LSMs will be initializing the LSM state associated with the
> ns_common instance.

Looks good.

> 
> > diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> > index 259c4b4f1eeb..f0b30d1907e7 100644
> > --- a/kernel/nsproxy.c
> > +++ b/kernel/nsproxy.c
> > @@ -379,7 +379,13 @@ static int prepare_nsset(unsigned flags, struct nsset *nsset)
> >
> >  static inline int validate_ns(struct nsset *nsset, struct ns_common *ns)
> >  {
> > -       return ns->ops->install(nsset, ns);
> > +       int ret;
> > +
> > +       ret = ns->ops->install(nsset, ns);
> > +       if (ret)
> > +               return ret;
> > +
> > +       return security_namespace_install(nsset, ns);
> >  }
> 
> Do we also want a security_namespace_switch() called from within
> switch_task_namespaces()?  Of course LSMs would not be able to fail or
> return an error at that point, but it seems reasonable that LSMs might
> want to update LSM state associated with the current task once the
> namespaces have been changed.  This is similar to all the "_post_" LSM
> hooks we have for various operations in the VFS and network layers.

What cannot be infered from security_namespace_install()?

> 
> I think we would want to pass both the task_struct and whichever
> nsproxy instance is not stored in the task_struct to the hook.  I
> prefer placing the hook after the task_struct has been updated, but if
> anyone feels strongly that it should be the other way that's okay with
> me.
> 
> > diff --git a/security/security.c b/security/security.c
> > index 67af9228c4e9..dcf073cac848 100644
> > --- a/security/security.c
> > +++ b/security/security.c
> > +/**
> > + * security_namespace_free() - Release LSM security data from a namespace
> > + * @ns: the namespace being freed
> > + *
> > + * Release security data attached to the namespace. Called before the
> > + * namespace structure is freed.
> > + *
> > + * Note: The namespace may be freed via kfree_rcu(). LSMs must use
> > + * RCU-safe freeing for any data that might be accessed by concurrent
> > + * RCU readers.
> > + */
> > +void security_namespace_free(struct ns_common *ns)
> > +{
> > +       if (!ns->ns_security)
> > +               return;
> > +
> > +       call_void_hook(namespace_free, ns);
> > +
> > +       kfree(ns->ns_security);
> > +       ns->ns_security = NULL;
> > +}
> 
> The "namespace may be freed via kfree_rcu()" comment in conjunction
> with the standard kfree() in the function above raises a red flag.  Do
> we need to take an approach similar to
> security_inode_free()/inode_free_by_rcu() here?

I though at that too, but the key point is that once the namespace is
being freed it is not reachable, which is not the case for inode being
freed (i.e. because of path walks).  So I think the current approach is
fine even if the comment might be a bit scary.

> 
> -- 
> paul-moore.com
> 

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox