Linux Security Modules development
 help / color / mirror / Atom feed
* Re: [RFC PATCH v1 00/11] Landlock: Namespace and capability control
From: Günther Noack @ 2026-04-22 21:16 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Christian Brauner, Günther Noack, Paul Moore,
	Serge E . Hallyn, Justin Suess, Lennart Poettering,
	Mikhail Ivanov, Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang,
	kernel-team, linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260421.aen9Pheishah@digikod.net>

On Tue, Apr 21, 2026 at 10:24:00AM +0200, Mickaël Salaün wrote:
> On Mon, Apr 20, 2026 at 05:06:32PM +0200, Günther Noack wrote:
> > Hello!
> > 
> > On Thu, Mar 12, 2026 at 11:04:33AM +0100, Mickaël Salaün wrote:
> > > Namespaces are a fundamental building block for containers and
> > > application sandboxes, but user namespace creation significantly widens
> > > the kernel attack surface.  CVE-2022-0185 (filesystem mount parsing),
> > > CVE-2022-25636 and CVE-2023-32233 (netfilter), and CVE-2022-0492 (cgroup
> > > v1 release_agent) all demonstrate vulnerabilities exploitable only
> > > through capabilities gained via user namespaces.  Some distributions
> > > block user namespace creation entirely, but this removes a useful
> > > isolation primitive.  Fine-grained control allows trusted programs to
> > > use namespaces while preventing unnecessary exposure for programs that
> > > do not need them.
> > > 
> > > Existing mechanisms (user.max_*_namespaces sysctls, userns_create LSM
> > > hook, PR_SET_NO_NEW_PRIVS, and capset) each address part of this threat
> > > but none provides per-process, fine-grained control over both namespace
> > > types and capabilities.  Container runtimes resort to seccomp-based
> > > clone/unshare filtering, but seccomp cannot dereference clone3's flag
> > > structure, forcing runtimes to block clone3 entirely.
> > > 
> > > Landlock's composable layer model enables several patterns: a user
> > > session manager can restrict namespace types and capabilities broadly
> > > while allowing trusted programs to create the namespaces they need, and
> > > each deeper layer can further restrict the allowed set.  Container
> > > runtimes can similarly deny namespace creation inside managed
> > > containers.
> > 
> > I assume we are talking about an unrestricted systemd user session
> > manager, which would not itself be restricted?  (If the entire user
> > session were running under Landlock, users couldn't change their
> > passwords with "passwd" any more, because of the no_new_privs
> > requirement.)
> 
> systemd can be use to create such session, as other init systems.
> If no_new_privs is set, commands such as passwd would indeed not work,
> but:
> 1. The process applying the Landlock restrictions (e.g. creating the
>    user session) doesn't need to set no_new_privs if it has
>    CAP_SYS_ADMIN in the current user namespace.
> 2. SUID programs can (and should probably) be replaced with proper
>    client/server interfaces (i.e. for the client to not be privileged),
>    see DBus services (e.g. Account) or homectl for instance.

I also think services are a better approach than the suid bit, but
that's to my knowledge not the state of affairs yet (until Lennart
makes it happen, hint hint ;-)).


> > > This series adds two new permission categories to Landlock:
> > > 
> > > - LANDLOCK_PERM_NAMESPACE_ENTER: Restricts which namespace types a
> > >   sandboxed process can acquire: both creation (unshare/clone) and entry
> > >   (setns).  User namespace creation has no capability check in the
> > >   kernel, so this is the only enforcement mechanism for that entry
> > >   point.
> > > 
> > > - LANDLOCK_PERM_CAPABILITY_USE: Restricts which Linux capabilities a
> > >   sandboxed process can use, regardless of how they were obtained
> > >   (including through user namespace creation).
> > 
> > Given that you already went through multiple iterations here, I fully
> 
> It's the first public one, but it's well advanced.
> 
> > expect that I am overlooking something here, but based on the
> > explanation, it's not clear to me why the capability control is needed
> > in addition to the namespace control, to reduce the kernel attack
> > surface.
> > 
> > In my understanding the "attack surface" problem with user namespaces
> > is that they allow unprivileged processes to gain CAP_SYS_ADMIN within
> > that namespace, which unlocks access to code paths which were
> > traditionally reserved for the (top level) root user.
> 
> This capability and others.
> 
> > 
> > But then, to prevent that from happening, it seems that restricting
> > access to user namespace creation would be sufficient?
> 
> It would be sufficient to limit the kernel attack surface, but it would
> make all the related features unusable.  As explained in this cover
> letter, there are already several ways to block everything, but this
> doesn't help for a lot of use cases and this Landlock feature proposes a
> new fine-grained and unprivileged way to properly restrict some
> capabilities.
> 
> > 
> > (Also, in some cases, I suspect it might be possible to break
> > assumptions that more privileged processes make about filesystem
> > layout if the user can change the mount layout.  But that is not an
> > issue with Landlock, as we forbid changes to mounts and also require
> > no_new_privs.)
> > 
> > 
> > > Both use new handled_perm and LANDLOCK_RULE_* constants following the
> > > existing allow-list model.  The UAPI uses raw CAP_* and CLONE_NEW*
> > > values directly; unknown values are silently accepted for forward
> > > compatibility (the allow-list denies them by default).  The Landlock ABI
> > > version is bumped from 8 to 9.
> > 
> > Compatibility question:
> > 
> > For both permission categories, when they are "handled" in the
> > ruleset, they default to denying *all* types of namespaces, and *all*
> > types of capabilities.
> > 
> > This is different to the handled_access_* rights, where we are
> > requiring users to explicitly list all restricted rights as "handled",
> > because the full list of available operations might be a moving
> > target.
> > 
> > Why is this not a problem for capabilities and for namespaces?  Both
> > the list of capabilities and the list of namespaces has been expanded
> > in the past.  What happens if a new capability or namespace is
> > invented?  If these are evolved, is that backwards compatible for the
> > existing users of these Landlock permission categories?
> 
> This question is answered is the documentation (and the commit
> messages), and that's the main difference between handled_access_* and
> handled_perm.  In a nutshell, the permission rules uses non-Landlock
> bits that naturally evolve without any Landlock-specific changes.

I think the deny-by-default is fine given that these namespaces and
capabilities do not exist yet.  It is the case where users add a rule
and we silently ignore unknown bits in the bitfield, which I think
introduces a small problem.  I responded to the documentation commit
with what I believe is a counterexample for the capabilities case.
(Let's discuss it on the documentation patch in the context of the
examples.)


> > > The handled_perm infrastructure is designed to be reusable by future
> > > permission categories.  The last patch documents the design rationale
> > > for the permission model and the criteria for choosing between
> > > handled_access_*, handled_perm, and scoped.  A patch series to add
> > > socket creation control is under review [2]; it could benefit from the
> > > same permission model to achieve complete deny-by-default coverage of
> > > socket creation.
> 
> See here ^
> 
> > > 
> > > This series builds on Christian Brauner's namespace LSM blob RFC [1],
> > > included as patch 1.
> > > 
> > > Christian, could you please review patch 3?  It adds a FOR_EACH_NS_TYPE
> > > X-macro to ns_common_types.h and derives CLONE_NS_ALL, replacing inline
> > > CLONE_NEW* flag enumerations in nsproxy.c and fork.c.
> > > 
> > > Paul, could you please review patch 2?  It adds LSM_AUDIT_DATA_NS, a new
> > > audit record type that logs namespace_type and inum for
> > > namespace-related LSM denials.
> > > 
> > > All four example vulnerabilities follow the same pattern: an
> > > unprivileged user creates a user namespace to obtain capabilities, then
> > > creates a second namespace to exercise them against vulnerable code.
> > > LANDLOCK_PERM_NAMESPACE_ENTER prevents this by denying the user
> > > namespace (eliminating the capability grant) or the specific namespace
> > > type needed to exercise it.  LANDLOCK_PERM_CAPABILITY_USE independently
> > > prevents it by denying the required capability.
> > 
> > Here, it is also not clear to me why LANDLOCK_PERM_CAPABILITY_USE is
> > needed in addition to LANDLOCK_PERM_NAMESPACE_ENTER.
> 
> This is also explained in the documentation.

> > Looking at capabilities(7), my understanding is that capabilities can
> > only be acquired through:
> > 
> > (1) user namespaces (prevented with LANDLOCK_PERM_NAMESPACE_ENTER)
> > (2) execve (setuid or individual capabilities, prevented using
> >     PR_SET_NO_NEW_PRIVS)
> > 
> > ...so if a process were to start out with no such capabilities,
> > wouldn't that be enough to prevent it from gaining more?  Am I
> > overlooking another way through which these can be acquired?
> > 
> > The Landlock capability support adds a "filter" for the use of
> > capabilities, but my understanding of the capability system was that
> > it already *is* that filter.  As long as we prevent the acquisition of
> > new capabilities, shouldn't that be sufficient?
> 
> In a nutshell, capabilities applies to namespaces (and their type), so
> it makes sense to be able to control them together, see the chroot
> example.  Please take a look at the documentation.

I had a hard time puzzling it together in the documentation, but the
chroot example helped.

So, if I am understanding correctly, the idea is that you need it in
order to create a new user namespace, but the restrict the use of
capabilities within that user namespace (not only CAP_SYS_ADMIN, but
also more individual ones).  Sounds reasonable.

I can also see that in order to do that without the Landlock
capability support, the first process within the new namespace would
immediately need to drop capabilities, and that may be outside of the
control of the person defining the Landlock policy..?

–Günther

^ permalink raw reply

* Re: [RFC PATCH v1 10/11] samples/landlock: Add capability and namespace restriction support
From: Günther Noack @ 2026-04-22 21:20 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Christian Brauner, Günther Noack, Paul Moore,
	Serge E . Hallyn, Justin Suess, Lennart Poettering,
	Mikhail Ivanov, Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang,
	kernel-team, linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260312100444.2609563-11-mic@digikod.net>

On Thu, Mar 12, 2026 at 11:04:43AM +0100, Mickaël Salaün wrote:
> Extend the sandboxer sample to demonstrate the new Landlock capability
> and namespace restriction features.  The LL_CAPS environment variable
> takes a colon-delimited list of allowed capability numbers (e.g. "18"
> for CAP_SYS_CHROOT).  The LL_NS variable takes a colon-delimited list of
> allowed namespace types by short name (e.g.  "user:uts:net").  Update
> LANDLOCK_ABI_LAST to 9 and add best-effort degradation for older
> kernels.
> 
> Allow creating user and UTS namespaces but deny network namespaces
> (works as an unprivileged user).  All capabilities are available
> (LL_CAPS is not set), but namespace creation is still restricted to the
> types listed in LL_NS.  The first command succeeds because user and UTS
> types are in the allowed set, and sets the hostname inside the new UTS
> namespace.  The second command fails because the network namespace type
> is not allowed by the LANDLOCK_PERM_NAMESPACE_ENTER rule:
> 
>   LL_FS_RO=/ LL_FS_RW=/proc LL_NS="user:uts" \
>     ./sandboxer /bin/sh -c \
>     "unshare --user --uts --map-root-user hostname sandbox \
>     && ! unshare --user --net true"
> 
> Allow only user namespace creation and CAP_SYS_CHROOT (18), denying all
> other capabilities and namespace types (works as an unprivileged user).
> An unprivileged process creates a user namespace (no capability
> required) and calls chroot inside it using the CAP_SYS_CHROOT granted
> within the new namespace:
> 
>   LL_FS_RO=/ LL_FS_RW="" LL_NS="user" LL_CAPS="18" \
>     ./sandboxer /bin/sh -c \
>     "unshare --user --keep-caps chroot / true"
> 
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Günther Noack <gnoack@google.com>
> Cc: Paul Moore <paul@paul-moore.com>
> Cc: Serge E. Hallyn <serge@hallyn.com>
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> ---
>  samples/landlock/sandboxer.c | 164 +++++++++++++++++++++++++++++++++--
>  1 file changed, 155 insertions(+), 9 deletions(-)
> 
> diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
> index 9f21088c0855..09c499703835 100644
> --- a/samples/landlock/sandboxer.c
> +++ b/samples/landlock/sandboxer.c
> @@ -14,6 +14,8 @@
>  #include <fcntl.h>
>  #include <linux/landlock.h>
>  #include <linux/socket.h>
> +#include <sched.h>
> +#include <stdbool.h>
>  #include <stddef.h>
>  #include <stdio.h>
>  #include <stdlib.h>
> @@ -22,12 +24,16 @@
>  #include <sys/stat.h>
>  #include <sys/syscall.h>
>  #include <unistd.h>
> -#include <stdbool.h>
>  
>  #if defined(__GLIBC__)
>  #include <linux/prctl.h>
>  #endif
>  
> +/* From include/linux/bits.h, not available in userspace. */
> +#ifndef BITS_PER_TYPE
> +#define BITS_PER_TYPE(type) (sizeof(type) * 8)
> +#endif
> +
>  #ifndef landlock_create_ruleset
>  static inline int
>  landlock_create_ruleset(const struct landlock_ruleset_attr *const attr,
> @@ -60,6 +66,8 @@ static inline int landlock_restrict_self(const int ruleset_fd,
>  #define ENV_FS_RW_NAME "LL_FS_RW"
>  #define ENV_TCP_BIND_NAME "LL_TCP_BIND"
>  #define ENV_TCP_CONNECT_NAME "LL_TCP_CONNECT"
> +#define ENV_CAPS_NAME "LL_CAPS"
> +#define ENV_NS_NAME "LL_NS"
>  #define ENV_SCOPED_NAME "LL_SCOPED"
>  #define ENV_FORCE_LOG_NAME "LL_FORCE_LOG"
>  #define ENV_DELIMITER ":"
> @@ -226,11 +234,125 @@ static int populate_ruleset_net(const char *const env_var, const int ruleset_fd,
>  	return ret;
>  }
>  
> +static __u64 str2ns(const char *const name)
> +{
> +	static const struct {
> +		const char *name;
> +		__u64 value;
> +	} ns_map[] = {
> +		/* clang-format off */
> +		{ "cgroup",	CLONE_NEWCGROUP },
> +		{ "ipc",	CLONE_NEWIPC },
> +		{ "mnt",	CLONE_NEWNS },
> +		{ "net",	CLONE_NEWNET },
> +		{ "pid",	CLONE_NEWPID },
> +		{ "time",	CLONE_NEWTIME },
> +		{ "user",	CLONE_NEWUSER },
> +		{ "uts",	CLONE_NEWUTS },
> +		/* clang-format on */
> +	};
> +	size_t i;
> +
> +	for (i = 0; i < sizeof(ns_map) / sizeof(ns_map[0]); i++) {
> +		if (strcmp(name, ns_map[i].name) == 0)
> +			return ns_map[i].value;
> +	}
> +	return 0;
> +}
> +
> +static int populate_ruleset_caps(const char *const env_var,
> +				 const int ruleset_fd)
> +{
> +	int ret = 1;
> +	char *env_cap_name, *env_cap_name_next, *strcap;
> +	struct landlock_capability_attr cap_attr = {
> +		.allowed_perm = LANDLOCK_PERM_CAPABILITY_USE,
> +	};
> +
> +	env_cap_name = getenv(env_var);
> +	if (!env_cap_name)
> +		return 0;
> +	env_cap_name = strdup(env_cap_name);
> +	unsetenv(env_var);
> +
> +	env_cap_name_next = env_cap_name;
> +	while ((strcap = strsep(&env_cap_name_next, ENV_DELIMITER))) {
> +		__u64 cap;
> +
> +		if (strcmp(strcap, "") == 0)
> +			continue;
> +
> +		if (str2num(strcap, &cap) ||

libcap has cap_from_name(3).  I believe we are linking with libcap
already to drop them before tests.  (I have not used this function
myself yet, but it sounds like it would address this case.)


> +		    cap >= BITS_PER_TYPE(cap_attr.capabilities)) {
> +			fprintf(stderr,
> +				"Failed to parse capability at \"%s\"\n",
> +				strcap);
> +			goto out_free_name;
> +		}
> +		cap_attr.capabilities = 1ULL << cap;
> +		if (landlock_add_rule(ruleset_fd, LANDLOCK_RULE_CAPABILITY,
> +				      &cap_attr, 0)) {
> +			fprintf(stderr,
> +				"Failed to update the ruleset with capability \"%llu\": %s\n",
> +				(unsigned long long)cap, strerror(errno));
> +			goto out_free_name;
> +		}
> +	}
> +	ret = 0;
> +
> +out_free_name:
> +	free(env_cap_name);
> +	return ret;
> +}
> +
> +static int populate_ruleset_ns(const char *const env_var, const int ruleset_fd)
> +{
> +	int ret = 1;
> +	char *env_ns_name, *env_ns_name_next, *strns;
> +	struct landlock_namespace_attr ns_attr = {
> +		.allowed_perm = LANDLOCK_PERM_NAMESPACE_ENTER,
> +	};
> +
> +	env_ns_name = getenv(env_var);
> +	if (!env_ns_name)
> +		return 0;
> +	env_ns_name = strdup(env_ns_name);
> +	unsetenv(env_var);
> +
> +	env_ns_name_next = env_ns_name;
> +	while ((strns = strsep(&env_ns_name_next, ENV_DELIMITER))) {
> +		__u64 ns_type;
> +
> +		if (strcmp(strns, "") == 0)
> +			continue;
> +
> +		ns_type = str2ns(strns);
> +		if (!ns_type) {
> +			fprintf(stderr, "Unknown namespace type \"%s\"\n",
> +				strns);
> +			goto out_free_name;
> +		}
> +		ns_attr.namespace_types = ns_type;
> +		if (landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NAMESPACE,
> +				      &ns_attr, 0)) {
> +			fprintf(stderr,
> +				"Failed to update the ruleset with namespace \"%s\": %s\n",
> +				strns, strerror(errno));
> +			goto out_free_name;
> +		}
> +	}
> +	ret = 0;
> +
> +out_free_name:
> +	free(env_ns_name);
> +	return ret;
> +}
> +
>  /* Returns true on error, false otherwise. */
>  static bool check_ruleset_scope(const char *const env_var,
>  				struct landlock_ruleset_attr *ruleset_attr)
>  {
> -	char *env_type_scope, *env_type_scope_next, *ipc_scoping_name;
> +	char *env_type_scope, *env_type_scope_next, *scope_name;
>  	bool error = false;
>  	bool abstract_scoping = false;
>  	bool signal_scoping = false;
> @@ -247,16 +369,14 @@ static bool check_ruleset_scope(const char *const env_var,
>  
>  	env_type_scope = strdup(env_type_scope);
>  	env_type_scope_next = env_type_scope;
> -	while ((ipc_scoping_name =
> -			strsep(&env_type_scope_next, ENV_DELIMITER))) {
> -		if (strcmp("a", ipc_scoping_name) == 0 && !abstract_scoping) {
> +	while ((scope_name = strsep(&env_type_scope_next, ENV_DELIMITER))) {
> +		if (strcmp("a", scope_name) == 0 && !abstract_scoping) {
>  			abstract_scoping = true;
> -		} else if (strcmp("s", ipc_scoping_name) == 0 &&
> -			   !signal_scoping) {
> +		} else if (strcmp("s", scope_name) == 0 && !signal_scoping) {
>  			signal_scoping = true;
>  		} else {
>  			fprintf(stderr, "Unknown or duplicate scope \"%s\"\n",
> -				ipc_scoping_name);
> +				scope_name);
>  			error = true;
>  			goto out_free_name;
>  		}
> @@ -299,7 +419,7 @@ static bool check_ruleset_scope(const char *const env_var,
>  
>  /* clang-format on */
>  
> -#define LANDLOCK_ABI_LAST 8
> +#define LANDLOCK_ABI_LAST 9
>  
>  #define XSTR(s) #s
>  #define STR(s) XSTR(s)
> @@ -322,6 +442,10 @@ static const char help[] =
>  	"means an empty list):\n"
>  	"* " ENV_TCP_BIND_NAME ": ports allowed to bind (server)\n"
>  	"* " ENV_TCP_CONNECT_NAME ": ports allowed to connect (client)\n"
> +	"* " ENV_CAPS_NAME ": capability numbers allowed to use "
> +	"(e.g. 10 for CAP_NET_BIND_SERVICE, 21 for CAP_SYS_ADMIN)\n"
> +	"* " ENV_NS_NAME ": namespace types allowed to enter "
> +	"(cgroup, ipc, mnt, net, pid, time, user, uts)\n"
>  	"* " ENV_SCOPED_NAME ": actions denied on the outside of the landlock domain\n"
>  	"  - \"a\" to restrict opening abstract unix sockets\n"
>  	"  - \"s\" to restrict sending signals\n"
> @@ -334,6 +458,8 @@ static const char help[] =
>  	ENV_FS_RW_NAME "=\"/dev/null:/dev/full:/dev/zero:/dev/pts:/tmp\" "
>  	ENV_TCP_BIND_NAME "=\"9418\" "
>  	ENV_TCP_CONNECT_NAME "=\"80:443\" "
> +	ENV_CAPS_NAME "=\"21\" "
> +	ENV_NS_NAME "=\"user:uts:net\" "
>  	ENV_SCOPED_NAME "=\"a:s\" "
>  	"%1$s bash -i\n"
>  	"\n"
> @@ -357,6 +483,8 @@ int main(const int argc, char *const argv[], char *const *const envp)
>  				      LANDLOCK_ACCESS_NET_CONNECT_TCP,
>  		.scoped = LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
>  			  LANDLOCK_SCOPE_SIGNAL,
> +		.handled_perm = LANDLOCK_PERM_CAPABILITY_USE |
> +				LANDLOCK_PERM_NAMESPACE_ENTER,
>  	};
>  	int supported_restrict_flags = LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON;
>  	int set_restrict_flags = 0;
> @@ -438,6 +566,10 @@ int main(const int argc, char *const argv[], char *const *const envp)
>  			~LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON;
>  		__attribute__((fallthrough));
>  	case 7:
> +		__attribute__((fallthrough));
> +	case 8:
> +		/* Removes permission support for ABI < 9 */
> +		ruleset_attr.handled_perm = 0;
>  		/* Must be printed for any ABI < LANDLOCK_ABI_LAST. */
>  		fprintf(stderr,
>  			"Hint: You should update the running kernel "
> @@ -470,6 +602,14 @@ int main(const int argc, char *const argv[], char *const *const envp)
>  			~LANDLOCK_ACCESS_NET_CONNECT_TCP;
>  	}
>  
> +	/* Removes capability handling if not set by a user. */
> +	if (!getenv(ENV_CAPS_NAME))
> +		ruleset_attr.handled_perm &= ~LANDLOCK_PERM_CAPABILITY_USE;
> +
> +	/* Removes namespace handling if not set by a user. */
> +	if (!getenv(ENV_NS_NAME))
> +		ruleset_attr.handled_perm &= ~LANDLOCK_PERM_NAMESPACE_ENTER;
> +
>  	if (check_ruleset_scope(ENV_SCOPED_NAME, &ruleset_attr))
>  		return 1;
>  
> @@ -514,6 +654,12 @@ int main(const int argc, char *const argv[], char *const *const envp)
>  		goto err_close_ruleset;
>  	}
>  
> +	if (populate_ruleset_caps(ENV_CAPS_NAME, ruleset_fd))
> +		goto err_close_ruleset;
> +
> +	if (populate_ruleset_ns(ENV_NS_NAME, ruleset_fd))
> +		goto err_close_ruleset;
> +
>  	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
>  		perror("Failed to restrict privileges");
>  		goto err_close_ruleset;
> -- 
> 2.53.0
> 

^ permalink raw reply

* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Mimi Zohar @ 2026-04-22 21:20 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
	dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
	oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
	will, noodles, sebastianene
In-Reply-To: <aekkVQwueKbFtG7C@e129823.arm.com>

On Wed, 2026-04-22 at 20:41 +0100, Yeoreum Yun wrote:
> > Hi Mimi,
> > 
> > > On Wed, 2026-04-22 at 17:24 +0100, Yeoreum Yun wrote:
> > > > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
> > > > the TPM driver must be built as built-in and
> > > > must be probed before the IMA subsystem is initialized.
> > > > 
> > > > However, when the TPM device operates over the FF-A protocol using
> > > > the CRB interface, probing fails and returns -EPROBE_DEFER if
> > > > the tpm_crb_ffa device — an FF-A device that provides the communication
> > > > interface to the tpm_crb driver — has not yet been probed.
> > > > 
> > > > To ensure the TPM device operating over the FF-A protocol with
> > > > the CRB interface is probed before IMA initialization,
> > > > the following conditions must be met:
> > > > 
> > > >    1. The corresponding ffa_device must be registered,
> > > >       which is done via ffa_init().
> > > > 
> > > >    2. The tpm_crb_driver must successfully probe this device via
> > > >       tpm_crb_ffa_init().
> > > > 
> > > >    3. The tpm_crb driver using CRB over FF-A can then
> > > >       be probed successfully. (See crb_acpi_add() and
> > > >       tpm_crb_ffa_init() for reference.)
> > > > 
> > > > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
> > > > all registered with device_initcall, which means crb_acpi_driver_init() may
> > > > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
> > > > 
> > > > When this occurs, probing the TPM device is deferred.
> > > > However, the deferred probe can happen after the IMA subsystem
> > > > has already been initialized, since IMA initialization is performed
> > > > during late_initcall, and deferred_probe_initcall() is performed
> > > > at the same level.
> > > > 
> > > > To resolve this, call ima_init() again at late_inicall_sync level
> > > > so that let IMA not miss TPM PCR value when generating boot_aggregate
> > > > log though TPM device presents in the system.
> > > > 
> > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > > 
> > > A lot of change for just detecting whether ima_init() is being called on
> > > late_initcall or late_initcall_sync(), without any explanation for all the other
> > > changes (e.g. ima_init_core).
> > > 
> > > Please just limit the change to just calling ima_init() twice.
> > 
> > My concern is that ima_update_policy_flags() will be called
> > when ima_init() is deferred -- not initialised anything.
> > though functionally, it might be okay however,
> > I think ima_update_policy_flags() and notifier should work after ima_init()
> > works logically.
> > 
> > This change I think not much quite a lot. just wrapper ima_init() with
> > ima_init_core() with some error handling.
> > 
> > Am I missing something?
> 
> Also, if we handle in ima_init() only, but it failed with other reason,
> we shouldn't call again ima_init() in the late_initcall_sync.
> 
> To handle this, It wouldn't do in the ima_init() but we need to handle
> it by caller of ima_init().

Only tpm_default_chip() is being called to set the ima_tpm_chip.  On failure,
instead of going into TPM-bypass mode, return immediately.  There are no calls
to anything else.  Just call ima_init() a second time.

Mimi



^ permalink raw reply

* Re: [RFC PATCH v1 01/11] security: add LSM blob and hooks for namespaces
From: Günther Noack @ 2026-04-22 21:21 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Christian Brauner, Günther Noack, Paul Moore,
	Serge E . Hallyn, Justin Suess, Lennart Poettering,
	Mikhail Ivanov, Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang,
	kernel-team, linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260312100444.2609563-2-mic@digikod.net>

On Thu, Mar 12, 2026 at 11:04:34AM +0100, Mickaël Salaün wrote:
> From: Christian Brauner <brauner@kernel.org>
> 
> All namespace types now share the same ns_common infrastructure. Extend
> this to include a security blob so LSMs can start managing namespaces
> uniformly without having to add one-off hooks or security fields to
> every individual namespace type.
> 
> Add a ns_security pointer to ns_common and the corresponding lbs_ns
> blob size to lsm_blob_sizes. Allocation and freeing hooks are called
> from the common __ns_common_init() and __ns_common_free() paths so
> every namespace type gets covered in one go. All information about the
> namespace type and the appropriate casting helpers to get at the
> containing namespace are available via ns_common making it
> straightforward for LSMs to differentiate when they need to.
> 
> A namespace_install hook is called from validate_ns() during setns(2)
> giving LSMs a chance to enforce policy on namespace transitions.
> 
> Individual namespace types can still have their own specialized security
> hooks when needed. This is just the common baseline that makes it easy
> to track and manage namespaces from the security side without requiring
> every namespace type to reinvent the wheel.
> 
> Cc: Günther Noack <gnoack@google.com>
> Cc: Paul Moore <paul@paul-moore.com>
> Cc: Serge E. Hallyn <serge@hallyn.com>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> Link: https://lore.kernel.org/r/20260216-work-security-namespace-v1-1-075c28758e1f@kernel.org
> ---
>  include/linux/lsm_hook_defs.h      |  3 ++
>  include/linux/lsm_hooks.h          |  1 +
>  include/linux/ns/ns_common_types.h |  3 ++
>  include/linux/security.h           | 20 ++++++++
>  kernel/nscommon.c                  | 12 +++++
>  kernel/nsproxy.c                   |  8 +++-
>  security/lsm_init.c                |  2 +
>  security/security.c                | 76 ++++++++++++++++++++++++++++++
>  8 files changed, 124 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> index 8c42b4bde09c..fefd3aa6d8f4 100644
> --- a/include/linux/lsm_hook_defs.h
> +++ b/include/linux/lsm_hook_defs.h
> @@ -260,6 +260,9 @@ LSM_HOOK(int, -ENOSYS, task_prctl, int option, unsigned long arg2,
>  LSM_HOOK(void, LSM_RET_VOID, task_to_inode, struct task_struct *p,
>  	 struct inode *inode)
>  LSM_HOOK(int, 0, userns_create, const struct cred *cred)
> +LSM_HOOK(int, 0, namespace_alloc, struct ns_common *ns)
> +LSM_HOOK(void, LSM_RET_VOID, namespace_free, struct ns_common *ns)
> +LSM_HOOK(int, 0, namespace_install, const struct nsset *nsset, struct ns_common *ns)
>  LSM_HOOK(int, 0, ipc_permission, struct kern_ipc_perm *ipcp, short flag)
>  LSM_HOOK(void, LSM_RET_VOID, ipc_getlsmprop, struct kern_ipc_perm *ipcp,
>  	 struct lsm_prop *prop)
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index d48bf0ad26f4..3e7afe76e86c 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -111,6 +111,7 @@ struct lsm_blob_sizes {
>  	unsigned int lbs_ipc;
>  	unsigned int lbs_key;
>  	unsigned int lbs_msg_msg;
> +	unsigned int lbs_ns;
>  	unsigned int lbs_perf_event;
>  	unsigned int lbs_task;
>  	unsigned int lbs_xattr_count; /* num xattr slots in new_xattrs array */
> diff --git a/include/linux/ns/ns_common_types.h b/include/linux/ns/ns_common_types.h
> index 0014fbc1c626..170288e2e895 100644
> --- a/include/linux/ns/ns_common_types.h
> +++ b/include/linux/ns/ns_common_types.h
> @@ -115,6 +115,9 @@ struct ns_common {
>  	struct dentry *stashed;
>  	const struct proc_ns_operations *ops;
>  	unsigned int inum;
> +#ifdef CONFIG_SECURITY
> +	void *ns_security;
> +#endif
>  	union {
>  		struct ns_tree;
>  		struct rcu_head ns_rcu;
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 83a646d72f6f..611b9098367d 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -67,6 +67,7 @@ enum fs_value_type;
>  struct watch;
>  struct watch_notification;
>  struct lsm_ctx;
> +struct nsset;
>  
>  /* Default (no) options for the capable function */
>  #define CAP_OPT_NONE 0x0
> @@ -80,6 +81,7 @@ struct lsm_ctx;
>  
>  struct ctl_table;
>  struct audit_krule;
> +struct ns_common;
>  struct user_namespace;
>  struct timezone;
>  
> @@ -533,6 +535,9 @@ int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>  			unsigned long arg4, unsigned long arg5);
>  void security_task_to_inode(struct task_struct *p, struct inode *inode);
>  int security_create_user_ns(const struct cred *cred);
> +int security_namespace_alloc(struct ns_common *ns);
> +void security_namespace_free(struct ns_common *ns);
> +int security_namespace_install(const struct nsset *nsset, struct ns_common *ns);
>  int security_ipc_permission(struct kern_ipc_perm *ipcp, short flag);
>  void security_ipc_getlsmprop(struct kern_ipc_perm *ipcp, struct lsm_prop *prop);
>  int security_msg_msg_alloc(struct msg_msg *msg);
> @@ -1407,6 +1412,21 @@ static inline int security_create_user_ns(const struct cred *cred)
>  	return 0;
>  }
>  
> +static inline int security_namespace_alloc(struct ns_common *ns)
> +{
> +	return 0;
> +}
> +
> +static inline void security_namespace_free(struct ns_common *ns)
> +{
> +}
> +
> +static inline int security_namespace_install(const struct nsset *nsset,
> +					     struct ns_common *ns)
> +{
> +	return 0;
> +}
> +
>  static inline int security_ipc_permission(struct kern_ipc_perm *ipcp,
>  					  short flag)
>  {
> diff --git a/kernel/nscommon.c b/kernel/nscommon.c
> index bdc3c86231d3..de774e374f9d 100644
> --- a/kernel/nscommon.c
> +++ b/kernel/nscommon.c
> @@ -4,6 +4,7 @@
>  #include <linux/ns_common.h>
>  #include <linux/nstree.h>
>  #include <linux/proc_ns.h>
> +#include <linux/security.h>
>  #include <linux/user_namespace.h>
>  #include <linux/vfsdebug.h>
>  
> @@ -59,6 +60,9 @@ int __ns_common_init(struct ns_common *ns, u32 ns_type, const struct proc_ns_ope
>  
>  	refcount_set(&ns->__ns_ref, 1);
>  	ns->stashed = NULL;
> +#ifdef CONFIG_SECURITY
> +	ns->ns_security = NULL;
> +#endif
>  	ns->ops = ops;
>  	ns->ns_id = 0;
>  	ns->ns_type = ns_type;
> @@ -77,6 +81,13 @@ int __ns_common_init(struct ns_common *ns, u32 ns_type, const struct proc_ns_ope
>  		ret = proc_alloc_inum(&ns->inum);
>  	if (ret)
>  		return ret;
> +
> +	ret = security_namespace_alloc(ns);
> +	if (ret) {
> +		proc_free_inum(ns->inum);
> +		return ret;
> +	}
> +
>  	/*
>  	 * Tree ref starts at 0. It's incremented when namespace enters
>  	 * active use (installed in nsproxy) and decremented when all
> @@ -91,6 +102,7 @@ int __ns_common_init(struct ns_common *ns, u32 ns_type, const struct proc_ns_ope
>  
>  void __ns_common_free(struct ns_common *ns)
>  {
> +	security_namespace_free(ns);
>  	proc_free_inum(ns->inum);
>  }
>  
> diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> index 259c4b4f1eeb..f0b30d1907e7 100644
> --- a/kernel/nsproxy.c
> +++ b/kernel/nsproxy.c
> @@ -379,7 +379,13 @@ static int prepare_nsset(unsigned flags, struct nsset *nsset)
>  
>  static inline int validate_ns(struct nsset *nsset, struct ns_common *ns)
>  {
> -	return ns->ops->install(nsset, ns);
> +	int ret;
> +
> +	ret = ns->ops->install(nsset, ns);
> +	if (ret)
> +		return ret;
> +
> +	return security_namespace_install(nsset, ns);
>  }
>  
>  /*
> diff --git a/security/lsm_init.c b/security/lsm_init.c
> index 573e2a7250c4..637c2d65e131 100644
> --- a/security/lsm_init.c
> +++ b/security/lsm_init.c
> @@ -301,6 +301,7 @@ static void __init lsm_prepare(struct lsm_info *lsm)
>  	lsm_blob_size_update(&blobs->lbs_ipc, &blob_sizes.lbs_ipc);
>  	lsm_blob_size_update(&blobs->lbs_key, &blob_sizes.lbs_key);
>  	lsm_blob_size_update(&blobs->lbs_msg_msg, &blob_sizes.lbs_msg_msg);
> +	lsm_blob_size_update(&blobs->lbs_ns, &blob_sizes.lbs_ns);
>  	lsm_blob_size_update(&blobs->lbs_perf_event,
>  			     &blob_sizes.lbs_perf_event);
>  	lsm_blob_size_update(&blobs->lbs_sock, &blob_sizes.lbs_sock);
> @@ -446,6 +447,7 @@ int __init security_init(void)
>  		lsm_pr("blob(ipc) size %d\n", blob_sizes.lbs_ipc);
>  		lsm_pr("blob(key) size %d\n", blob_sizes.lbs_key);
>  		lsm_pr("blob(msg_msg)_size %d\n", blob_sizes.lbs_msg_msg);
> +		lsm_pr("blob(ns) size %d\n", blob_sizes.lbs_ns);
>  		lsm_pr("blob(sock) size %d\n", blob_sizes.lbs_sock);
>  		lsm_pr("blob(superblock) size %d\n", blob_sizes.lbs_superblock);
>  		lsm_pr("blob(perf_event) size %d\n", blob_sizes.lbs_perf_event);
> diff --git a/security/security.c b/security/security.c
> index 67af9228c4e9..dcf073cac848 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -26,6 +26,7 @@
>  #include <linux/string.h>
>  #include <linux/xattr.h>
>  #include <linux/msg.h>
> +#include <linux/ns_common.h>
>  #include <linux/overflow.h>
>  #include <linux/perf_event.h>
>  #include <linux/fs.h>
> @@ -355,6 +356,19 @@ static int lsm_superblock_alloc(struct super_block *sb)
>  			      GFP_KERNEL);
>  }
>  
> +/**
> + * lsm_ns_alloc - allocate a composite namespace blob
> + * @ns: the namespace that needs a blob
> + *
> + * Allocate the namespace blob for all the modules
> + *
> + * Returns 0, or -ENOMEM if memory can't be allocated.
> + */
> +static int lsm_ns_alloc(struct ns_common *ns)
> +{
> +	return lsm_blob_alloc(&ns->ns_security, blob_sizes.lbs_ns, GFP_KERNEL);
> +}
> +
>  /**
>   * lsm_fill_user_ctx - Fill a user space lsm_ctx structure
>   * @uctx: a userspace LSM context to be filled
> @@ -3255,6 +3269,68 @@ int security_create_user_ns(const struct cred *cred)
>  	return call_int_hook(userns_create, cred);
>  }
>  
> +/**
> + * security_namespace_alloc() - Allocate LSM security data for a namespace
> + * @ns: the namespace being allocated
> + *
> + * Allocate and attach security data to the namespace. The namespace type
> + * is available via ns->ns_type, and the owning user namespace (if any)
> + * via ns->ops->owner(ns).
> + *
> + * Return: Returns 0 if successful, otherwise < 0 error code.
> + */
> +int security_namespace_alloc(struct ns_common *ns)
> +{
> +	int rc;
> +
> +	rc = lsm_ns_alloc(ns);
> +	if (unlikely(rc))
> +		return rc;
> +
> +	rc = call_int_hook(namespace_alloc, ns);
> +	if (unlikely(rc))
> +		security_namespace_free(ns);
> +
> +	return rc;
> +}
> +
> +/**
> + * security_namespace_free() - Release LSM security data from a namespace
> + * @ns: the namespace being freed
> + *
> + * Release security data attached to the namespace. Called before the
> + * namespace structure is freed.
> + *
> + * Note: The namespace may be freed via kfree_rcu(). LSMs must use
> + * RCU-safe freeing for any data that might be accessed by concurrent
> + * RCU readers.
> + */
> +void security_namespace_free(struct ns_common *ns)
> +{
> +	if (!ns->ns_security)
> +		return;
> +
> +	call_void_hook(namespace_free, ns);
> +
> +	kfree(ns->ns_security);
> +	ns->ns_security = NULL;
> +}
> +
> +/**
> + * security_namespace_install() - Check permission to install a namespace
> + * @nsset: the target nsset being configured
> + * @ns: the namespace being installed
> + *
> + * Check permission before allowing a namespace to be installed into the
> + * process's set of namespaces via setns(2).
> + *
> + * Return: Returns 0 if permission is granted, otherwise < 0 error code.
> + */
> +int security_namespace_install(const struct nsset *nsset, struct ns_common *ns)
> +{
> +	return call_int_hook(namespace_install, nsset, ns);
> +}
> +
>  /**
>   * security_ipc_permission() - Check if sysv ipc access is allowed
>   * @ipcp: ipc permission structure
> -- 
> 2.53.0
> 

Reviewed-by: Günther Noack <gnoack3000@gmail.com>

^ permalink raw reply

* Re: [RFC PATCH v1 02/11] security: Add LSM_AUDIT_DATA_NS for namespace audit records
From: Günther Noack @ 2026-04-22 21:21 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Christian Brauner, Günther Noack, Paul Moore,
	Serge E . Hallyn, Justin Suess, Lennart Poettering,
	Mikhail Ivanov, Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang,
	kernel-team, linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260312100444.2609563-3-mic@digikod.net>

On Thu, Mar 12, 2026 at 11:04:35AM +0100, Mickaël Salaün wrote:
> Add a new LSM audit data type LSM_AUDIT_DATA_NS that logs namespace
> information in audit records.  Two fields are provided, matching the
> field names of struct ns_common:
> 
> - ns_type: the CLONE_NEW* flag identifying the namespace type, logged in
>   hexadecimal.
> 
> - inum: the proc inode number identifying a specific namespace instance.
>   Namespace inode numbers are allocated by proc_alloc_inum() via
>   ida_alloc_max() bounded to UINT_MAX, so the value always fits in 32
>   bits.
> 
> A new audit data type is needed because no existing LSM_AUDIT_DATA_*
> type carries namespace information.  The closest alternatives (e.g.
> LSM_AUDIT_DATA_TASK or LSM_AUDIT_DATA_NONE with custom strings) would
> either lose the namespace type or require ad-hoc formatting that
> bypasses the structured audit data union.
> 
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Günther Noack <gnoack@google.com>
> Cc: Paul Moore <paul@paul-moore.com>
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> ---
>  include/linux/lsm_audit.h | 5 +++++
>  security/lsm_audit.c      | 4 ++++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/include/linux/lsm_audit.h b/include/linux/lsm_audit.h
> index 382c56a97bba..6e20a56b8c22 100644
> --- a/include/linux/lsm_audit.h
> +++ b/include/linux/lsm_audit.h
> @@ -78,6 +78,7 @@ struct common_audit_data {
>  #define LSM_AUDIT_DATA_NOTIFICATION 16
>  #define LSM_AUDIT_DATA_ANONINODE	17
>  #define LSM_AUDIT_DATA_NLMSGTYPE	18
> +#define LSM_AUDIT_DATA_NS		19
>  	union 	{
>  		struct path path;
>  		struct dentry *dentry;
> @@ -100,6 +101,10 @@ struct common_audit_data {
>  		int reason;
>  		const char *anonclass;
>  		u16 nlmsg_type;
> +		struct {
> +			u32 ns_type;
> +			unsigned int inum;
> +		} ns;
>  	} u;
>  	/* this union contains LSM specific data */
>  	union {
> diff --git a/security/lsm_audit.c b/security/lsm_audit.c
> index 7d623b00495c..7f71a77c1c12 100644
> --- a/security/lsm_audit.c
> +++ b/security/lsm_audit.c
> @@ -403,6 +403,10 @@ void audit_log_lsm_data(struct audit_buffer *ab,
>  	case LSM_AUDIT_DATA_NLMSGTYPE:
>  		audit_log_format(ab, " nl-msgtype=%hu", a->u.nlmsg_type);
>  		break;
> +	case LSM_AUDIT_DATA_NS:
> +		audit_log_format(ab, " namespace_type=0x%x namespace_inum=%u",
> +				 a->u.ns.ns_type, a->u.ns.inum);
> +		break;
>  	} /* switch (a->type) */
>  }
>  
> -- 
> 2.53.0
> 

Reviewed-by: Günther Noack <gnoack3000@gmail.com>

^ permalink raw reply

* Re: [RFC PATCH v1 04/11] landlock: Wrap per-layer access masks in struct layer_rights
From: Günther Noack @ 2026-04-22 21:29 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Christian Brauner, Günther Noack, Paul Moore,
	Serge E . Hallyn, Justin Suess, Lennart Poettering,
	Mikhail Ivanov, Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang,
	kernel-team, linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260312100444.2609563-5-mic@digikod.net>

On Thu, Mar 12, 2026 at 11:04:37AM +0100, Mickaël Salaün wrote:
> The per-layer FAM in struct landlock_ruleset currently stores struct
> access_masks directly, but upcoming permission features (capability
> and namespace restrictions) need additional per-layer data beyond the
> handled-access bitfields.
> 
> Introduce struct layer_rights as a wrapper around struct access_masks
> and rename the FAM from access_masks[] to layers[].  This makes room
> for future per-layer fields (e.g. allowed bitmasks) without modifying
> struct access_masks itself, which is also used as a lightweight
> parameter type for functions that only need the handled-access
> bitfields.
> 
> No functional change.
> 
> Cc: Günther Noack <gnoack@google.com>
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> ---
>  security/landlock/access.h   | 29 ++++++++++++++++++++++-------
>  security/landlock/cred.h     |  2 +-
>  security/landlock/ruleset.c  | 12 ++++++------
>  security/landlock/ruleset.h  | 28 +++++++++++++++-------------
>  security/landlock/syscalls.c |  2 +-
>  5 files changed, 45 insertions(+), 28 deletions(-)
> 
> diff --git a/security/landlock/access.h b/security/landlock/access.h
> index 42c95747d7bd..b3e147771a0e 100644
> --- a/security/landlock/access.h
> +++ b/security/landlock/access.h
> @@ -19,7 +19,7 @@
>  
>  /*
>   * All access rights that are denied by default whether they are handled or not
> - * by a ruleset/layer.  This must be ORed with all ruleset->access_masks[]
> + * by a ruleset/layer.  This must be ORed with all ruleset->layers[]
>   * entries when we need to get the absolute handled access masks, see
>   * landlock_upgrade_handled_access_masks().

Nit: It doesn't get ORed with the ruleset->layers[] entries, but with
the access field within them.  Suggestion:

  This must be ORed with the access field in all ruleset->layers[] entries...


>   */
> @@ -45,7 +45,7 @@ static_assert(BITS_PER_TYPE(access_mask_t) >= LANDLOCK_NUM_SCOPE);
>  /* Makes sure for_each_set_bit() and for_each_clear_bit() calls are OK. */
>  static_assert(sizeof(unsigned long) >= sizeof(access_mask_t));
>  
> -/* Ruleset access masks. */
> +/* Handled access masks (bitfields only). */
>  struct access_masks {
>  	access_mask_t fs : LANDLOCK_NUM_ACCESS_FS;
>  	access_mask_t net : LANDLOCK_NUM_ACCESS_NET;
> @@ -61,6 +61,21 @@ union access_masks_all {
>  static_assert(sizeof(typeof_member(union access_masks_all, masks)) ==
>  	      sizeof(typeof_member(union access_masks_all, all)));
>  
> +/**
> + * struct layer_rights - Per-layer access configuration
> + *
> + * Wraps the handled-access bitfields together with any additional per-layer
> + * data (e.g. allowed bitmasks added by future patches).  This is the element
> + * type of the &struct landlock_ruleset.layers FAM.
> + */
> +struct layer_rights {
> +	/**
> +	 * @handled: Bitmask of access rights handled (i.e. restricted) by
> +	 * this layer.
> +	 */
> +	struct access_masks handled;
> +};
> +
>  /**
>   * struct layer_access_masks - A boolean matrix of layers and access rights
>   *
> @@ -100,17 +115,17 @@ static_assert(BITS_PER_TYPE(deny_masks_t) >=
>  static_assert(HWEIGHT(LANDLOCK_MAX_NUM_LAYERS) == 1);
>  
>  /* Upgrades with all initially denied by default access rights. */
> -static inline struct access_masks
> -landlock_upgrade_handled_access_masks(struct access_masks access_masks)
> +static inline struct layer_rights
> +landlock_upgrade_handled_access_masks(struct layer_rights layer_rights)
                            ^^^^^^^^^^^^

Now that this is taking "layer_rights" not access_masks, is this still
the right function name?

>  {
>  	/*
>  	 * All access rights that are denied by default whether they are
>  	 * explicitly handled or not.
>  	 */
> -	if (access_masks.fs)
> -		access_masks.fs |= _LANDLOCK_ACCESS_FS_INITIALLY_DENIED;
> +	if (layer_rights.handled.fs)
> +		layer_rights.handled.fs |= _LANDLOCK_ACCESS_FS_INITIALLY_DENIED;
>  
> -	return access_masks;
> +	return layer_rights;
>  }
>  
>  /* Checks the subset relation between access masks. */
> diff --git a/security/landlock/cred.h b/security/landlock/cred.h
> index f287c56b5fd4..3e2a7e88710e 100644
> --- a/security/landlock/cred.h
> +++ b/security/landlock/cred.h
> @@ -139,7 +139,7 @@ landlock_get_applicable_subject(const struct cred *const cred,
>  	for (layer_level = domain->num_layers - 1; layer_level >= 0;
>  	     layer_level--) {
>  		union access_masks_all layer = {
> -			.masks = domain->access_masks[layer_level],
> +			.masks = domain->layers[layer_level].handled,
>  		};
>  
>  		if (layer.all & masks_all.all) {
> diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
> index 181df7736bb9..a7f8be37ec31 100644
> --- a/security/landlock/ruleset.c
> +++ b/security/landlock/ruleset.c
> @@ -32,7 +32,7 @@ static struct landlock_ruleset *create_ruleset(const u32 num_layers)
>  {
>  	struct landlock_ruleset *new_ruleset;
>  
> -	new_ruleset = kzalloc_flex(*new_ruleset, access_masks, num_layers,
> +	new_ruleset = kzalloc_flex(*new_ruleset, layers, num_layers,
>  				   GFP_KERNEL_ACCOUNT);
>  	if (!new_ruleset)
>  		return ERR_PTR(-ENOMEM);
> @@ -48,7 +48,7 @@ static struct landlock_ruleset *create_ruleset(const u32 num_layers)
>  	/*
>  	 * hierarchy = NULL
>  	 * num_rules = 0
> -	 * access_masks[] = 0
> +	 * layers[] = 0
>  	 */
>  	return new_ruleset;
>  }
> @@ -381,8 +381,8 @@ static int merge_ruleset(struct landlock_ruleset *const dst,
>  		err = -EINVAL;
>  		goto out_unlock;
>  	}
> -	dst->access_masks[dst->num_layers - 1] =
> -		landlock_upgrade_handled_access_masks(src->access_masks[0]);
> +	dst->layers[dst->num_layers - 1] =
> +		landlock_upgrade_handled_access_masks(src->layers[0]);
>  
>  	/* Merges the @src inode tree. */
>  	err = merge_tree(dst, src, LANDLOCK_KEY_INODE);
> @@ -464,8 +464,8 @@ static int inherit_ruleset(struct landlock_ruleset *const parent,
>  		goto out_unlock;
>  	}
>  	/* Copies the parent layer stack and leaves a space for the new layer. */
> -	memcpy(child->access_masks, parent->access_masks,
> -	       flex_array_size(parent, access_masks, parent->num_layers));
> +	memcpy(child->layers, parent->layers,
> +	       flex_array_size(parent, layers, parent->num_layers));
>  
>  	if (WARN_ON_ONCE(!parent->hierarchy)) {
>  		err = -EINVAL;
> diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
> index 889f4b30301a..900c47eb0216 100644
> --- a/security/landlock/ruleset.h
> +++ b/security/landlock/ruleset.h
> @@ -146,7 +146,7 @@ struct landlock_ruleset {
>  		 * section.  This is only used by
>  		 * landlock_put_ruleset_deferred() when @usage reaches zero.
>  		 * The fields @lock, @usage, @num_rules, @num_layers and
> -		 * @access_masks are then unused.
> +		 * @layers are then unused.
>  		 */
>  		struct work_struct work_free;
>  		struct {
> @@ -173,9 +173,10 @@ struct landlock_ruleset {
>  			 */
>  			u32 num_layers;
>  			/**
> -			 * @access_masks: Contains the subset of filesystem and
> -			 * network actions that are restricted by a ruleset.
> -			 * A domain saves all layers of merged rulesets in a
> +			 * @layers: Per-layer access configuration, including
> +			 * handled access masks and allowed permission
> +			 * bitmasks.  A domain saves all layers of merged
> +			 * rulesets in a
                           ^^^^^^^^^^^^^
Nit: Unconventional line break

>  			 * stack (FAM), starting from the first layer to the
>  			 * last one.  These layers are used when merging
>  			 * rulesets, for user space backward compatibility
> @@ -184,7 +185,7 @@ struct landlock_ruleset {
>  			 * layers are set once and never changed for the
>  			 * lifetime of the ruleset.
>  			 */
> -			struct access_masks access_masks[];
> +			struct layer_rights layers[] __counted_by(num_layers);

Thanks for adding __counted_by() 🏆

>  		};
>  	};
>  };
> @@ -224,7 +225,8 @@ static inline void landlock_get_ruleset(struct landlock_ruleset *const ruleset)
>   *
>   * @domain: Landlock ruleset (used as a domain)
>   *
> - * Return: An access_masks result of the OR of all the domain's access masks.
> + * Return: An access_masks result of the OR of all the domain's handled access
> + * masks.
>   */
>  static inline struct access_masks
>  landlock_union_access_masks(const struct landlock_ruleset *const domain)
> @@ -234,7 +236,7 @@ landlock_union_access_masks(const struct landlock_ruleset *const domain)
>  
>  	for (layer_level = 0; layer_level < domain->num_layers; layer_level++) {
>  		union access_masks_all layer = {
> -			.masks = domain->access_masks[layer_level],
> +			.masks = domain->layers[layer_level].handled,
>  		};
>  
>  		matches.all |= layer.all;
> @@ -252,7 +254,7 @@ landlock_add_fs_access_mask(struct landlock_ruleset *const ruleset,
>  
>  	/* Should already be checked in sys_landlock_create_ruleset(). */
>  	WARN_ON_ONCE(fs_access_mask != fs_mask);
> -	ruleset->access_masks[layer_level].fs |= fs_mask;
> +	ruleset->layers[layer_level].handled.fs |= fs_mask;
>  }
>  
>  static inline void
> @@ -264,7 +266,7 @@ landlock_add_net_access_mask(struct landlock_ruleset *const ruleset,
>  
>  	/* Should already be checked in sys_landlock_create_ruleset(). */
>  	WARN_ON_ONCE(net_access_mask != net_mask);
> -	ruleset->access_masks[layer_level].net |= net_mask;
> +	ruleset->layers[layer_level].handled.net |= net_mask;
>  }
>  
>  static inline void
> @@ -275,7 +277,7 @@ landlock_add_scope_mask(struct landlock_ruleset *const ruleset,
>  
>  	/* Should already be checked in sys_landlock_create_ruleset(). */
>  	WARN_ON_ONCE(scope_mask != mask);
> -	ruleset->access_masks[layer_level].scope |= mask;
> +	ruleset->layers[layer_level].handled.scope |= mask;
>  }
>  
>  static inline access_mask_t
> @@ -283,7 +285,7 @@ landlock_get_fs_access_mask(const struct landlock_ruleset *const ruleset,
>  			    const u16 layer_level)
>  {
>  	/* Handles all initially denied by default access rights. */
> -	return ruleset->access_masks[layer_level].fs |
> +	return ruleset->layers[layer_level].handled.fs |
>  	       _LANDLOCK_ACCESS_FS_INITIALLY_DENIED;
>  }
>  
> @@ -291,14 +293,14 @@ static inline access_mask_t
>  landlock_get_net_access_mask(const struct landlock_ruleset *const ruleset,
>  			     const u16 layer_level)
>  {
> -	return ruleset->access_masks[layer_level].net;
> +	return ruleset->layers[layer_level].handled.net;
>  }
>  
>  static inline access_mask_t
>  landlock_get_scope_mask(const struct landlock_ruleset *const ruleset,
>  			const u16 layer_level)
>  {
> -	return ruleset->access_masks[layer_level].scope;
> +	return ruleset->layers[layer_level].handled.scope;
>  }
>  
>  bool landlock_unmask_layers(const struct landlock_rule *const rule,
> diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
> index 3b33839b80c7..2aa7b50d875f 100644
> --- a/security/landlock/syscalls.c
> +++ b/security/landlock/syscalls.c
> @@ -341,7 +341,7 @@ static int add_rule_path_beneath(struct landlock_ruleset *const ruleset,
>  		return -ENOMSG;
>  
>  	/* Checks that allowed_access matches the @ruleset constraints. */
> -	mask = ruleset->access_masks[0].fs;
> +	mask = ruleset->layers[0].handled.fs;
>  	if ((path_beneath_attr.allowed_access | mask) != mask)
>  		return -EINVAL;
>  
> -- 
> 2.53.0
> 

^ permalink raw reply

* Re: [RFC PATCH v1 06/11] landlock: Enforce capability restrictions
From: Günther Noack @ 2026-04-22 21:36 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Christian Brauner, Günther Noack, Paul Moore,
	Serge E . Hallyn, Justin Suess, Lennart Poettering,
	Mikhail Ivanov, Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang,
	kernel-team, linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260312100444.2609563-7-mic@digikod.net>

Hello!

On Thu, Mar 12, 2026 at 11:04:39AM +0100, Mickaël Salaün wrote:
> diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
> index 152d952e98f6..38a4bf92781a 100644
> --- a/security/landlock/syscalls.c
> +++ b/security/landlock/syscalls.c
> [...]

> +	/*
> +	 * Stores only the capabilities this kernel knows about.
> +	 * Unknown bits are silently accepted for forward compatibility:
> +	 * user space compiled against newer headers can pass new
> +	 * CAP_* bits without getting EINVAL on older kernels.
> +	 * Unknown bits have no effect because no hook checks them.
> +	 */
> +	mutex_lock(&ruleset->lock);
> +	ruleset->layers[0].allowed.caps |=
> +		landlock_caps_to_bits(cap_attr.capabilities & CAP_VALID_MASK);
> +	mutex_unlock(&ruleset->lock);

See the example in the documentation patch set [1]; I think it can be
an incompatibility if we ignore the unknown bits here (and I don't
know of a scenario where it would be a problem to reject them).

[1] https://lore.kernel.org/all/20260422.5a7059c06fb0@gnoack.org/

–Günther

^ permalink raw reply

* Re: [PATCH] apparmor: Fix two bugs of aa_setup_dfa_engine's fail handling
From: Georgia Garcia @ 2026-04-22 21:51 UTC (permalink / raw)
  To: GONG Ruiqi, John Johansen, Paul Moore, James Morris,
	Serge E . Hallyn
  Cc: apparmor, linux-security-module, linux-kernel, lujialin4
In-Reply-To: <20260403035119.2132418-1-gongruiqi1@huawei.com>

Hello,

On Fri, 2026-04-03 at 11:51 +0800, GONG Ruiqi wrote:
> First, aa_dfa_unpack returns ERR_PTR not NULL when it fails, but
> aa_put_dfa only checks NULL for its input, which would cause invalid
> memory access in aa_put_dfa. Set nulldfa to NULL explicitly to fix that.
> 
> Second, aa_put_pdb calls aa_pdb_free_kref -> aa_free_pdb -> aa_put_dfa,
> i.e.  it will free nullpdb->dfa. But there's another aa_put_dfa(nulldfa)
> after aa_put_pdb(nullpdb), which would cause double free. Remove that
> redundant aa_put_dfa to fix that.
> 
> Fixes: 98b824ff8984 ("apparmor: refcount the pdb")
> Signed-off-by: GONG Ruiqi <gongruiqi1@huawei.com>
> ---
>  security/apparmor/lsm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> index c1d42fc72fdb..be82ec1b9fd9 100644
> --- a/security/apparmor/lsm.c
> +++ b/security/apparmor/lsm.c
> @@ -2465,6 +2465,7 @@ static int __init aa_setup_dfa_engine(void)
>  			    TO_ACCEPT2_FLAG(YYTD_DATA32));
>  	if (IS_ERR(nulldfa)) {
>  		error = PTR_ERR(nulldfa);
> +		nulldfa = NULL;
>  		goto fail;
>  	}
>  	nullpdb->dfa = aa_get_dfa(nulldfa);
> @@ -2486,7 +2487,6 @@ static int __init aa_setup_dfa_engine(void)
>  
>  fail:
>  	aa_put_pdb(nullpdb);
> -	aa_put_dfa(nulldfa);

This isn't right. aa_dfa_unpack does kref_init(&dfa->count), and later
we have nullpdb->dfa = aa_get_dfa(nulldfa);
So the second is put on aa_put_pdb but the first, from the init, does
need to be put too.

>  	nullpdb = NULL;
>  	nulldfa = NULL;
>  	stacksplitdfa = NULL;


^ permalink raw reply

* Re: [apparmor] [PATCH RESEND] apparmor: Fix string overrun due to missing termination
From: Georgia Garcia @ 2026-04-22 22:41 UTC (permalink / raw)
  To: Daniel J Blueman, John Johansen, Paul Moore, James Morris,
	Serge E. Hallyn, Thorsten Blum, apparmor, linux-security-module
  Cc: linux-kernel, stable
In-Reply-To: <20260327115833.7572-1-daniel@quora.org>

Hello,

On Fri, 2026-03-27 at 19:58 +0800, Daniel J Blueman wrote:
> This was introduced by previous incorrect conversion from strcpy(). Fix it
> by adding the missing terminator.
> 

Looks good to me,

Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com>

> Cc: stable@vger.kernel.org
> Signed-off-by: Daniel J Blueman <daniel@quora.org>
> Fixes: 93d4dbdc8da0 ("apparmor: Replace deprecated strcpy in d_namespace_path")
> ---
>  security/apparmor/path.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/security/apparmor/path.c b/security/apparmor/path.c
> index 65a0ca5cc1bd..2494e8101538 100644
> --- a/security/apparmor/path.c
> +++ b/security/apparmor/path.c
> @@ -164,14 +164,16 @@ static int d_namespace_path(const struct path *path, char *buf, char **name,
>  	}
>  
>  out:
> -	/* Append "/" to directory paths, except for root "/" which
> -	 * already ends in a slash.
> +	/* Append "/" to directory paths and reterminate string, except for
> +	 * root "/" which already ends in a slash.
>  	 */
>  	if (!error && isdir) {
>  		bool is_root = (*name)[0] == '/' && (*name)[1] == '\0';
>  
> -		if (!is_root)
> +		if (!is_root) {
>  			buf[aa_g_path_max - 2] = '/';
> +			buf[aa_g_path_max - 1] = '\0';
> +		}
>  	}
>  
>  	return error;
> --
> 2.53.0


^ permalink raw reply

* Re: [RFC PATCH v1 01/11] security: add LSM blob and hooks for namespaces
From: Paul Moore @ 2026-04-23  0:19 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Christian Brauner, Günther Noack, Serge E . Hallyn,
	Justin Suess, Lennart Poettering, Mikhail Ivanov,
	Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang, kernel-team,
	linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260312100444.2609563-2-mic@digikod.net>

On Thu, Mar 12, 2026 at 6:05 AM Mickaël Salaün <mic@digikod.net> wrote:
>
> From: Christian Brauner <brauner@kernel.org>
>
> All namespace types now share the same ns_common infrastructure. Extend
> this to include a security blob so LSMs can start managing namespaces
> uniformly without having to add one-off hooks or security fields to
> every individual namespace type.
>
> Add a ns_security pointer to ns_common and the corresponding lbs_ns
> blob size to lsm_blob_sizes. Allocation and freeing hooks are called
> from the common __ns_common_init() and __ns_common_free() paths so
> every namespace type gets covered in one go. All information about the
> namespace type and the appropriate casting helpers to get at the
> containing namespace are available via ns_common making it
> straightforward for LSMs to differentiate when they need to.
>
> A namespace_install hook is called from validate_ns() during setns(2)
> giving LSMs a chance to enforce policy on namespace transitions.
>
> Individual namespace types can still have their own specialized security
> hooks when needed. This is just the common baseline that makes it easy
> to track and manage namespaces from the security side without requiring
> every namespace type to reinvent the wheel.
>
> Cc: Günther Noack <gnoack@google.com>
> Cc: Paul Moore <paul@paul-moore.com>
> Cc: Serge E. Hallyn <serge@hallyn.com>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> Link: https://lore.kernel.org/r/20260216-work-security-namespace-v1-1-075c28758e1f@kernel.org
> ---
>  include/linux/lsm_hook_defs.h      |  3 ++
>  include/linux/lsm_hooks.h          |  1 +
>  include/linux/ns/ns_common_types.h |  3 ++
>  include/linux/security.h           | 20 ++++++++
>  kernel/nscommon.c                  | 12 +++++
>  kernel/nsproxy.c                   |  8 +++-
>  security/lsm_init.c                |  2 +
>  security/security.c                | 76 ++++++++++++++++++++++++++++++
>  8 files changed, 124 insertions(+), 1 deletion(-)

...

> diff --git a/kernel/nscommon.c b/kernel/nscommon.c
> index bdc3c86231d3..de774e374f9d 100644
> --- a/kernel/nscommon.c
> +++ b/kernel/nscommon.c
> @@ -77,6 +81,13 @@ int __ns_common_init(struct ns_common *ns, u32 ns_type, const struct proc_ns_ope
>                 ret = proc_alloc_inum(&ns->inum);
>         if (ret)
>                 return ret;
> +
> +       ret = security_namespace_alloc(ns);
> +       if (ret) {
> +               proc_free_inum(ns->inum);
> +               return ret;
> +       }

Since this is an RFC, I'll make the nitpicky comment that it would be
better if the LSM hook is called security_namespace_init() instead of
security_namespace_alloc().  This fits better with the convention of
aligning with the caller's name, as well as to helps to indicate that
the LSMs will be initializing the LSM state associated with the
ns_common instance.

> diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> index 259c4b4f1eeb..f0b30d1907e7 100644
> --- a/kernel/nsproxy.c
> +++ b/kernel/nsproxy.c
> @@ -379,7 +379,13 @@ static int prepare_nsset(unsigned flags, struct nsset *nsset)
>
>  static inline int validate_ns(struct nsset *nsset, struct ns_common *ns)
>  {
> -       return ns->ops->install(nsset, ns);
> +       int ret;
> +
> +       ret = ns->ops->install(nsset, ns);
> +       if (ret)
> +               return ret;
> +
> +       return security_namespace_install(nsset, ns);
>  }

Do we also want a security_namespace_switch() called from within
switch_task_namespaces()?  Of course LSMs would not be able to fail or
return an error at that point, but it seems reasonable that LSMs might
want to update LSM state associated with the current task once the
namespaces have been changed.  This is similar to all the "_post_" LSM
hooks we have for various operations in the VFS and network layers.

I think we would want to pass both the task_struct and whichever
nsproxy instance is not stored in the task_struct to the hook.  I
prefer placing the hook after the task_struct has been updated, but if
anyone feels strongly that it should be the other way that's okay with
me.

> diff --git a/security/security.c b/security/security.c
> index 67af9228c4e9..dcf073cac848 100644
> --- a/security/security.c
> +++ b/security/security.c
> +/**
> + * security_namespace_free() - Release LSM security data from a namespace
> + * @ns: the namespace being freed
> + *
> + * Release security data attached to the namespace. Called before the
> + * namespace structure is freed.
> + *
> + * Note: The namespace may be freed via kfree_rcu(). LSMs must use
> + * RCU-safe freeing for any data that might be accessed by concurrent
> + * RCU readers.
> + */
> +void security_namespace_free(struct ns_common *ns)
> +{
> +       if (!ns->ns_security)
> +               return;
> +
> +       call_void_hook(namespace_free, ns);
> +
> +       kfree(ns->ns_security);
> +       ns->ns_security = NULL;
> +}

The "namespace may be freed via kfree_rcu()" comment in conjunction
with the standard kfree() in the function above raises a red flag.  Do
we need to take an approach similar to
security_inode_free()/inode_free_by_rcu() here?

-- 
paul-moore.com

^ permalink raw reply

* Re: [PATCH] apparmor: Fix two bugs of aa_setup_dfa_engine's fail handling
From: GONG Ruiqi @ 2026-04-23  1:52 UTC (permalink / raw)
  To: Georgia Garcia, John Johansen, Paul Moore, James Morris,
	Serge E . Hallyn
  Cc: apparmor, linux-security-module, linux-kernel, lujialin4,
	zhaoyipeng
In-Reply-To: <1b87ab3652ca165364e1bb86623f2b26a135dae7.camel@canonical.com>

Hi Georgia,

On 4/23/2026 5:51 AM, Georgia Garcia wrote:
> ...
>> @@ -2486,7 +2487,6 @@ static int __init aa_setup_dfa_engine(void)
>>  
>>  fail:
>>  	aa_put_pdb(nullpdb);
>> -	aa_put_dfa(nulldfa);
> 
> This isn't right. aa_dfa_unpack does kref_init(&dfa->count), and later
> we have nullpdb->dfa = aa_get_dfa(nulldfa);
> So the second is put on aa_put_pdb but the first, from the init, does
> need to be put too.

Thanks for the feedback, and yes you're right. I didn't notice there's a
kref_init in aa_dfa_unpack...

I will submit a patch that only contains the first fix.

BR,
Ruiqi

> 
>>  	nullpdb = NULL;
>>  	nulldfa = NULL;
>>  	stacksplitdfa = NULL;
> 


^ permalink raw reply

* [PATCH] apparmor/lsm: Fix aa_dfa_unpack's error handling in aa_setup_dfa_engine
From: GONG Ruiqi @ 2026-04-23  3:10 UTC (permalink / raw)
  To: John Johansen, Paul Moore, James Morris, Serge E . Hallyn,
	Georgia Garcia
  Cc: apparmor, linux-security-module, linux-kernel, lujialin4,
	gongruiqi1, zhaoyipeng5

aa_dfa_unpack returns ERR_PTR not NULL when it fails, but aa_put_dfa
only checks NULL for its input, which would cause invalid memory access
in aa_put_dfa. Set nulldfa to NULL explicitly to fix that.

Fixes: 98b824ff8984 ("apparmor: refcount the pdb")
Signed-off-by: GONG Ruiqi <gongruiqi1@huawei.com>
---
 security/apparmor/lsm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index c1d42fc72fdb..ead2f07982b6 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -2465,6 +2465,7 @@ static int __init aa_setup_dfa_engine(void)
 			    TO_ACCEPT2_FLAG(YYTD_DATA32));
 	if (IS_ERR(nulldfa)) {
 		error = PTR_ERR(nulldfa);
+		nulldfa = NULL;
 		goto fail;
 	}
 	nullpdb->dfa = aa_get_dfa(nulldfa);
-- 
2.43.0


^ permalink raw reply related

* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Yeoreum Yun @ 2026-04-23  5:55 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
	dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
	oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
	will, noodles, sebastianene
In-Reply-To: <82803bb3b471898a77084c449b73c7f7b4eb2149.camel@linux.ibm.com>

> On Wed, 2026-04-22 at 20:41 +0100, Yeoreum Yun wrote:
> > > Hi Mimi,
> > >
> > > > On Wed, 2026-04-22 at 17:24 +0100, Yeoreum Yun wrote:
> > > > > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
> > > > > the TPM driver must be built as built-in and
> > > > > must be probed before the IMA subsystem is initialized.
> > > > >
> > > > > However, when the TPM device operates over the FF-A protocol using
> > > > > the CRB interface, probing fails and returns -EPROBE_DEFER if
> > > > > the tpm_crb_ffa device — an FF-A device that provides the communication
> > > > > interface to the tpm_crb driver — has not yet been probed.
> > > > >
> > > > > To ensure the TPM device operating over the FF-A protocol with
> > > > > the CRB interface is probed before IMA initialization,
> > > > > the following conditions must be met:
> > > > >
> > > > >    1. The corresponding ffa_device must be registered,
> > > > >       which is done via ffa_init().
> > > > >
> > > > >    2. The tpm_crb_driver must successfully probe this device via
> > > > >       tpm_crb_ffa_init().
> > > > >
> > > > >    3. The tpm_crb driver using CRB over FF-A can then
> > > > >       be probed successfully. (See crb_acpi_add() and
> > > > >       tpm_crb_ffa_init() for reference.)
> > > > >
> > > > > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
> > > > > all registered with device_initcall, which means crb_acpi_driver_init() may
> > > > > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
> > > > >
> > > > > When this occurs, probing the TPM device is deferred.
> > > > > However, the deferred probe can happen after the IMA subsystem
> > > > > has already been initialized, since IMA initialization is performed
> > > > > during late_initcall, and deferred_probe_initcall() is performed
> > > > > at the same level.
> > > > >
> > > > > To resolve this, call ima_init() again at late_inicall_sync level
> > > > > so that let IMA not miss TPM PCR value when generating boot_aggregate
> > > > > log though TPM device presents in the system.
> > > > >
> > > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > > >
> > > > A lot of change for just detecting whether ima_init() is being called on
> > > > late_initcall or late_initcall_sync(), without any explanation for all the other
> > > > changes (e.g. ima_init_core).
> > > >
> > > > Please just limit the change to just calling ima_init() twice.
> > >
> > > My concern is that ima_update_policy_flags() will be called
> > > when ima_init() is deferred -- not initialised anything.
> > > though functionally, it might be okay however,
> > > I think ima_update_policy_flags() and notifier should work after ima_init()
> > > works logically.
> > >
> > > This change I think not much quite a lot. just wrapper ima_init() with
> > > ima_init_core() with some error handling.
> > >
> > > Am I missing something?
> >
> > Also, if we handle in ima_init() only, but it failed with other reason,
> > we shouldn't call again ima_init() in the late_initcall_sync.
> >
> > To handle this, It wouldn't do in the ima_init() but we need to handle
> > it by caller of ima_init().
>
> Only tpm_default_chip() is being called to set the ima_tpm_chip.  On failure,
> instead of going into TPM-bypass mode, return immediately.  There are no calls
> to anything else.  Just call ima_init() a second time.

I’m not fully convinced this is sufficient.

What I meant is the case where ima_init() fails due to other
initialisation steps, not only tpm_default_chip() (e.g. ima_fs_init()).

If it fails at the late_initcall stage for such reasons, then we
should not call ima_init() again at late_initcall_sync.

For this reason, instead of adding a static variable inside
ima_init(), I think it would be better to manage the state in the
caller and introduce something like an ima_initialised flag. Also, if
initialisation fails for other reasons, the notifier block should be
unregistered.

I’d also like to ask again whether it is fine to call
ima_update_policy_flags() and keep the notifier registered in the
deferred TPM case. While this may be functionally acceptable, it seems
logically questionable to do so when ima_init() has not completed.

There is also a possibility that a deferred case ultimately fails (e.g.
deferred at late_initcall, but then failing at late_initcall_sync
for another reason, even while entering TPM bypass mode). In that case,
it seems more appropriate to handle this state in the caller of
ima_init(), rather than inside ima_init() itself.

Am I still missing something?

--
Sincerely,
Yeoreum Yun

^ permalink raw reply

* Re: [RFC PATCH v2 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Marc Zyngier @ 2026-04-23  8:34 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, jarkko, jgg,
	sudeep.holla, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will, noodles, sebastianene
In-Reply-To: <20260422162449.1814615-5-yeoreum.yun@arm.com>

On Wed, 22 Apr 2026 17:24:49 +0100,
Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> 
> When pKVM is enabled, the FF-A driver must be initialized after pKVM.
> Otherwise, pKVM cannot negotiate the FF-A version or
> obtain RX/TX buffer information, leading to failures in FF-A calls.
> 
> During FF-A driver initialization, check whether pKVM has been initialized.
> If pKVM isn't initailised, register notifier and do initialisation
> of FF-A driver when pKVM is initialized.
> 
> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> ---
>  arch/arm64/include/asm/virt.h     | 11 ++++++++++
>  arch/arm64/kvm/arm.c              | 21 ++++++++++++++++++
>  arch/arm64/kvm/pkvm.c             |  2 ++
>  drivers/firmware/arm_ffa/common.h |  4 ++--
>  drivers/firmware/arm_ffa/driver.c | 36 ++++++++++++++++++++++++++++++-
>  drivers/firmware/arm_ffa/smccc.c  |  2 +-
>  6 files changed, 72 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> index b51ab6840f9c..ad038a3b8727 100644
> --- a/arch/arm64/include/asm/virt.h
> +++ b/arch/arm64/include/asm/virt.h
> @@ -68,6 +68,8 @@
>  #include <asm/sysreg.h>
>  #include <asm/cpufeature.h>
> 
> +struct notifier_block;
> +
>  /*
>   * __boot_cpu_mode records what mode CPUs were booted in.
>   * A correctly-implemented bootloader must start all CPUs in the same mode:
> @@ -166,6 +168,15 @@ static inline bool is_hyp_nvhe(void)
>  	return is_hyp_mode_available() && !is_kernel_in_hyp_mode();
>  }
> 
> +enum kvm_arm_event {
> +	PKVM_INITIALISED,
> +	KVM_ARM_EVENT_MAX,
> +};

Well, no.

You are adding a whole infrastructure for something that happens
*once* in the lifetime of the system. What's next? D-Bus?

We already have a dependency mechanism, which I pointed to you last
time, and that you conveniently ignored. If that's not working for
you, then consider improving it.

If we had a whole set of in-kernel users depending on some global KVM
state change, we could look into it. But they are none, and all KVM
state changes are per-vcpu rather global.

So I'm not entertaining this invasive infrastructure for something so
limited.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply

* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Will Deacon @ 2026-04-23  8:57 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: Sudeep Holla, Marc Zyngier, linux-security-module, linux-kernel,
	linux-integrity, linux-arm-kernel, kvmarm, paul, jmorris, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, jarkko, oupton,
	joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
	sebastianene
In-Reply-To: <aejN52lwaqfoMuGJ@e129823.arm.com>

On Wed, Apr 22, 2026 at 02:32:23PM +0100, Yeoreum Yun wrote:
> Hi All,
> 
> > > On Tue, Apr 21, 2026 at 07:57:43AM +0100, Yeoreum Yun wrote:
> > >
> > > [...]
> > >
> > > >
> > > > Also, the FF-A initialization is not driven by a device probe, but rather
> > > > happens as part of the bus registration itself,
> > > > so it does not fit well with a device_link or probe deferral based approach.
> > > >
> > > > Instead, perhaps we could go with the idea I mentioned previously:
> > > > either introduce a notifier, or create a pseudo ffa_device
> > > > once pKVM initialization has completed, and
> > > > then let the ffa driver perform the additional initialization from there.
> > > >
> > > > Am I missing something?
> > > >
> > >
> > > In order to handle/cleanup some ugliness in interrupt management in the
> > > FF-A driver, we may introduce DT node eventually. But it will take sometime.
> >
> > Unfortunately, I think this DT node wouldn't be helpful to solve
> > this situation for dependency with the kvm misc device...
> >
> > IMHO, current situation, the notifier seems to good option. unless
> > we make the initcall to recongise this dependency.
> >
> 
> I think the best approach for now is to introduce a notifier to handle this situation.
> If there are no further suggestions, I’ll send a v2 based on:
>   - https://lore.kernel.org/all/aeS4rAeVQ0yJIPYw@e129823.arm.com/

I can't say that I'm a huge fan of that :/

The notifier will literally fire once, for a single listener. That's
called a function call.

Will

^ permalink raw reply

* Re: [RFC PATCH v2 3/4] firmware: arm_ffa: revert ffa_init() initcall level to device_initcall
From: Sudeep Holla @ 2026-04-23  9:13 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: linux-security-module, linux-kernel, Sudeep Holla,
	linux-integrity, linux-arm-kernel, kvmarm, paul, jmorris, serge,
	zohar, roberto.sassu, dmitry.kasatkin, eric.snowberg, jarkko, jgg,
	maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will, noodles, sebastianene
In-Reply-To: <20260422162449.1814615-4-yeoreum.yun@arm.com>

On Wed, Apr 22, 2026 at 05:24:48PM +0100, Yeoreum Yun wrote:
> commit 0e0546eabcd6 ("firmware: arm_ffa: Change initcall level of ffa_init() to rootfs_initcall")
> changed the initcall level of ffa_init() to rootfs_initcall to address
> an issue where IMA could not properly recognize the TPM device.
> 
> However, this introduces a problem: pKVM fails to handle any FF-A calls
> because it cannot trap the FFA_VERSION call invoked by ffa_init().
> 
> Since the IMA init function level has been changed to late_initcall_sync,
> there is no longer a need to keep ffa_init() at rootfs_initcall.
> Revert it back to device_initcall.
> 

I prefer you do actual git revert on the original commit for this as well
as the TPM CRM FFA driver explaining how the original idea fails in certain
conditions. Don't add it as separate commit and add fixes tag to the
original commits.

-- 
Regards,
Sudeep

^ permalink raw reply

* Re: [PATCH v2] evm: terminate and bound the evm_xattrs read buffer
From: Roberto Sassu @ 2026-04-23  9:31 UTC (permalink / raw)
  To: Pengpeng Hou, Mimi Zohar, Roberto Sassu
  Cc: Dmitry Kasatkin, Eric Snowberg, Paul Moore, James Morris,
	Serge Hallyn, linux-integrity, linux-security-module,
	linux-kernel
In-Reply-To: <b28a714c-aabe-49f3-a8ab-274feff34d85@huaweicloud.com>

On Fri, 2026-04-17 at 10:30 +0200, Roberto Sassu wrote:
> On 4/17/2026 2:44 PM, Pengpeng Hou wrote:
> > evm_read_xattrs() allocates size + 1 bytes, fills them from the list of
> > enabled xattrs, and then passes strlen(temp) to
> > simple_read_from_buffer(). When no configured xattrs are enabled, the
> > fill loop stores nothing and temp[0] remains uninitialized, so strlen()
> > reads beyond initialized memory.
> > 
> > Explicitly terminate the buffer after allocation, use snprintf() for
> > each formatted line, and pass the accumulated length to
> 
> pass the accumulate length (without risk of truncation) to ...
> 
> > simple_read_from_buffer().
> > 
> > Fixes: fa516b66a1bf ("EVM: Allow runtime modification of the set of verified xattrs")
> > Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
> > ---
> > Changes since v1:
> > - add the Fixes tag
> > - replace sprintf() with snprintf()
> > - explicitly terminate the buffer instead of switching to kzalloc()
> > 
> >   security/integrity/evm/evm_secfs.c | 11 ++++++-----
> >   1 file changed, 6 insertions(+), 5 deletions(-)
> > 
> > diff --git a/security/integrity/evm/evm_secfs.c b/security/integrity/evm/evm_secfs.c
> > index acd840461902..b7882a4ce9d0 100644
> > --- a/security/integrity/evm/evm_secfs.c
> > +++ b/security/integrity/evm/evm_secfs.c
> > @@ -127,8 +127,8 @@ static ssize_t evm_read_xattrs(struct file *filp, char __user *buf,
> >   			       size_t count, loff_t *ppos)
> >   {
> >   	char *temp;
> > -	int offset = 0;
> > -	ssize_t rc, size = 0;
> > +	size_t offset = 0, size = 0;
> > +	ssize_t rc;
> >   	struct xattr_list *xattr;
> >   
> >   	if (*ppos != 0)
> > @@ -150,17 +150,18 @@ static ssize_t evm_read_xattrs(struct file *filp, char __user *buf,
> >   		mutex_unlock(&xattr_list_mutex);
> >   		return -ENOMEM;
> >   	}
> 
> Please add a newline here.
> 
> > +	temp[size] = '\0';
> >   
> >   	list_for_each_entry(xattr, &evm_config_xattrnames, list) {
> >   		if (!xattr->enabled)
> >   			continue;
> >   
> > -		sprintf(temp + offset, "%s\n", xattr->name);
> > -		offset += strlen(xattr->name) + 1;
> 
> Also a comment like:
> 
> /*
>   * No truncation possible: size is computed over the same
>   * enabled xattrs under xattr_list_mutex, so offset never exceeds size.
>   */
> 
> to motivate why it is fine to increment offset without checking.

Any progress? The changes should be straightforward.

Thanks

Roberto

> Thanks
> 
> Roberto
> 
> > +		offset += snprintf(temp + offset, size + 1 - offset, "%s\n",
> > +				   xattr->name);
> >   	}
> >   
> >   	mutex_unlock(&xattr_list_mutex);
> > -	rc = simple_read_from_buffer(buf, count, ppos, temp, strlen(temp));
> > +	rc = simple_read_from_buffer(buf, count, ppos, temp, offset);
> >   
> >   	kfree(temp);
> >   


^ permalink raw reply

* Re: [RFC PATCH v2 2/4] tpm: tpm_crb_ffa: revert defered_probed when tpm_crb_ffa is built-in
From: Jarkko Sakkinen @ 2026-04-23 10:17 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, jgg, sudeep.holla,
	maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will, noodles, sebastianene
In-Reply-To: <20260422162449.1814615-3-yeoreum.yun@arm.com>

On Wed, Apr 22, 2026 at 05:24:47PM +0100, Yeoreum Yun wrote:
> commit 746d9e9f62a6 ("tpm: tpm_crb_ffa: try to probe tpm_crb_ffa when it's build_in")
> probe tpm_crb_ffa forcefully when it's built-in to integrate with IMA.
> 
> However, as IMA init function is changed to late_initcall_sync level.
> So, this change isn't required anymore.
> 
> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> ---
>  drivers/char/tpm/tpm_crb_ffa.c | 18 +++---------------
>  1 file changed, 3 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm_crb_ffa.c b/drivers/char/tpm/tpm_crb_ffa.c
> index 99f1c1e5644b..025c4d4b17ca 100644
> --- a/drivers/char/tpm/tpm_crb_ffa.c
> +++ b/drivers/char/tpm/tpm_crb_ffa.c
> @@ -177,23 +177,13 @@ static int tpm_crb_ffa_to_linux_errno(int errno)
>   */
>  int tpm_crb_ffa_init(void)
>  {
> -	int ret = 0;
> -
> -	if (!IS_MODULE(CONFIG_TCG_ARM_CRB_FFA)) {
> -		ret = ffa_register(&tpm_crb_ffa_driver);
> -		if (ret) {
> -			tpm_crb_ffa = ERR_PTR(-ENODEV);
> -			return ret;
> -		}
> -	}
> -
>  	if (!tpm_crb_ffa)
> -		ret = -ENOENT;
> +		return -ENOENT;
> 
>  	if (IS_ERR_VALUE(tpm_crb_ffa))
> -		ret = -ENODEV;
> +		return -ENODEV;
> 
> -	return ret;
> +	return 0;
>  }
>  EXPORT_SYMBOL_GPL(tpm_crb_ffa_init);
> 
> @@ -405,9 +395,7 @@ static struct ffa_driver tpm_crb_ffa_driver = {
>  	.id_table = tpm_crb_ffa_device_id,
>  };
> 
> -#ifdef MODULE
>  module_ffa_driver(tpm_crb_ffa_driver);
> -#endif
> 
>  MODULE_AUTHOR("Arm");
>  MODULE_DESCRIPTION("TPM CRB FFA driver");
> --
> LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
> 

I'll hold review to next version i.e. after Mimi's concerns
have been addressed.

BR, Jarkko

^ permalink raw reply

* Re: [RFC PATCH v2 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Yeoreum Yun @ 2026-04-23 10:29 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, jarkko, jgg,
	sudeep.holla, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will, noodles, sebastianene
In-Reply-To: <865x5i13dl.wl-maz@kernel.org>

Hi Marc,

> On Wed, 22 Apr 2026 17:24:49 +0100,
> Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> >
> > When pKVM is enabled, the FF-A driver must be initialized after pKVM.
> > Otherwise, pKVM cannot negotiate the FF-A version or
> > obtain RX/TX buffer information, leading to failures in FF-A calls.
> >
> > During FF-A driver initialization, check whether pKVM has been initialized.
> > If pKVM isn't initailised, register notifier and do initialisation
> > of FF-A driver when pKVM is initialized.
> >
> > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > ---
> >  arch/arm64/include/asm/virt.h     | 11 ++++++++++
> >  arch/arm64/kvm/arm.c              | 21 ++++++++++++++++++
> >  arch/arm64/kvm/pkvm.c             |  2 ++
> >  drivers/firmware/arm_ffa/common.h |  4 ++--
> >  drivers/firmware/arm_ffa/driver.c | 36 ++++++++++++++++++++++++++++++-
> >  drivers/firmware/arm_ffa/smccc.c  |  2 +-
> >  6 files changed, 72 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> > index b51ab6840f9c..ad038a3b8727 100644
> > --- a/arch/arm64/include/asm/virt.h
> > +++ b/arch/arm64/include/asm/virt.h
> > @@ -68,6 +68,8 @@
> >  #include <asm/sysreg.h>
> >  #include <asm/cpufeature.h>
> >
> > +struct notifier_block;
> > +
> >  /*
> >   * __boot_cpu_mode records what mode CPUs were booted in.
> >   * A correctly-implemented bootloader must start all CPUs in the same mode:
> > @@ -166,6 +168,15 @@ static inline bool is_hyp_nvhe(void)
> >  	return is_hyp_mode_available() && !is_kernel_in_hyp_mode();
> >  }
> >
> > +enum kvm_arm_event {
> > +	PKVM_INITIALISED,
> > +	KVM_ARM_EVENT_MAX,
> > +};
>
> Well, no.
>
> You are adding a whole infrastructure for something that happens
> *once* in the lifetime of the system. What's next? D-Bus?
>
> We already have a dependency mechanism, which I pointed to you last
> time, and that you conveniently ignored. If that's not working for
> you, then consider improving it.
>
> If we had a whole set of in-kernel users depending on some global KVM
> state change, we could look into it. But they are none, and all KVM
> state changes are per-vcpu rather global.
>
> So I'm not entertaining this invasive infrastructure for something so
> limited.

I think I misunderstood your suggestion at first — I wasn’t ignoring it,
and I apologise for that.

I initially considered hooking into /dev/kvm registration,
but there doesn’t seem to be a dedicated class or bus notifier for misc devices:

  - https://lore.kernel.org/all/aecf57rWloQwDh6v@e129823.arm.com/


Also, as I understand it, to make use of device_link,
FF-A would need to represent itself (and pKVM) as proper devices.

However, even if we rely on notifiers for when the pKVM device and
FF-A device are added, the ordering becomes problematic.
When the pKVM device is added and probed, the FF-A consumer would add into
deferred list be device core and  deferred_probe is triggered later
(during late_initcall).

In other words, once FF-A itself is deferred,
the deferred probe queue would look something like:

  (device depending on FF-A) → (some FF-A device) → (FF-A core)

especially since finalise_pkvm() runs at late_initcall_sync.

Unfortunately, deferred_probe_initcall() (also at late_initcall) only
calls driver_deferred_probe_trigger() twice. In this scenario,
the last device in the chain would not be probed immediately but only after a timeout.
As a result, IMA would also fail to find the device in time.

This is why I felt that device_link might not be suitable in this case —
although I may be misunderstanding something.

If this understanding is correct, I’m not sure what alternative we have,
other than adding some kind of notifier support (bus or class) to
the misc driver, or introducing a custom notifier.

Am I missing something?

--
Sincerely,
Yeoreum Yun

^ permalink raw reply

* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Yeoreum Yun @ 2026-04-23 10:56 UTC (permalink / raw)
  To: Will Deacon
  Cc: Sudeep Holla, Marc Zyngier, linux-security-module, linux-kernel,
	linux-integrity, linux-arm-kernel, kvmarm, paul, jmorris, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, jarkko, oupton,
	joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
	sebastianene
In-Reply-To: <aene4KFD5kbSbFRm@willie-the-truck>

Hi Will,

> On Wed, Apr 22, 2026 at 02:32:23PM +0100, Yeoreum Yun wrote:
> > Hi All,
> >
> > > > On Tue, Apr 21, 2026 at 07:57:43AM +0100, Yeoreum Yun wrote:
> > > >
> > > > [...]
> > > >
> > > > >
> > > > > Also, the FF-A initialization is not driven by a device probe, but rather
> > > > > happens as part of the bus registration itself,
> > > > > so it does not fit well with a device_link or probe deferral based approach.
> > > > >
> > > > > Instead, perhaps we could go with the idea I mentioned previously:
> > > > > either introduce a notifier, or create a pseudo ffa_device
> > > > > once pKVM initialization has completed, and
> > > > > then let the ffa driver perform the additional initialization from there.
> > > > >
> > > > > Am I missing something?
> > > > >
> > > >
> > > > In order to handle/cleanup some ugliness in interrupt management in the
> > > > FF-A driver, we may introduce DT node eventually. But it will take sometime.
> > >
> > > Unfortunately, I think this DT node wouldn't be helpful to solve
> > > this situation for dependency with the kvm misc device...
> > >
> > > IMHO, current situation, the notifier seems to good option. unless
> > > we make the initcall to recongise this dependency.
> > >
> >
> > I think the best approach for now is to introduce a notifier to handle this situation.
> > If there are no further suggestions, I’ll send a v2 based on:
> >   - https://lore.kernel.org/all/aeS4rAeVQ0yJIPYw@e129823.arm.com/
>
> I can't say that I'm a huge fan of that :/
>
> The notifier will literally fire once, for a single listener. That's
> called a function call.


I revisited Marc’s suggestion about using device links
(https://lore.kernel.org/all/87pl3vb5bm.wl-maz@kernel.org/)

but unless I’m misunderstanding something, I don’t think it would be a viable solution:
 - https://lore.kernel.org/all/aen0j3qM2k06OdXC@e129823.arm.com/#t

Also, calling functions defined by FF-A from KVM would introduce
an unnecessary module dependency between the KVM and FF-A drivers.

I’ve been trying to find an alternative approach,
but I’m not confident about what would be appropriate.

If you don’t mind, could you share your thoughts on this?

Thanks!

--
Sincerely,
Yeoreum Yun

^ permalink raw reply

* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Mimi Zohar @ 2026-04-23 11:01 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
	dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
	oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
	will, noodles, sebastianene
In-Reply-To: <aem0SSQuE1e3pGOS@e129823.arm.com>

On Thu, 2026-04-23 at 06:55 +0100, Yeoreum Yun wrote:
> > On Wed, 2026-04-22 at 20:41 +0100, Yeoreum Yun wrote:
> > > > Hi Mimi,
> > > > 
> > > > > On Wed, 2026-04-22 at 17:24 +0100, Yeoreum Yun wrote:
> > > > > > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
> > > > > > the TPM driver must be built as built-in and
> > > > > > must be probed before the IMA subsystem is initialized.
> > > > > > 
> > > > > > However, when the TPM device operates over the FF-A protocol using
> > > > > > the CRB interface, probing fails and returns -EPROBE_DEFER if
> > > > > > the tpm_crb_ffa device — an FF-A device that provides the communication
> > > > > > interface to the tpm_crb driver — has not yet been probed.
> > > > > > 
> > > > > > To ensure the TPM device operating over the FF-A protocol with
> > > > > > the CRB interface is probed before IMA initialization,
> > > > > > the following conditions must be met:
> > > > > > 
> > > > > >    1. The corresponding ffa_device must be registered,
> > > > > >       which is done via ffa_init().
> > > > > > 
> > > > > >    2. The tpm_crb_driver must successfully probe this device via
> > > > > >       tpm_crb_ffa_init().
> > > > > > 
> > > > > >    3. The tpm_crb driver using CRB over FF-A can then
> > > > > >       be probed successfully. (See crb_acpi_add() and
> > > > > >       tpm_crb_ffa_init() for reference.)
> > > > > > 
> > > > > > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
> > > > > > all registered with device_initcall, which means crb_acpi_driver_init() may
> > > > > > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
> > > > > > 
> > > > > > When this occurs, probing the TPM device is deferred.
> > > > > > However, the deferred probe can happen after the IMA subsystem
> > > > > > has already been initialized, since IMA initialization is performed
> > > > > > during late_initcall, and deferred_probe_initcall() is performed
> > > > > > at the same level.
> > > > > > 
> > > > > > To resolve this, call ima_init() again at late_inicall_sync level
> > > > > > so that let IMA not miss TPM PCR value when generating boot_aggregate
> > > > > > log though TPM device presents in the system.
> > > > > > 
> > > > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > > > > 
> > > > > A lot of change for just detecting whether ima_init() is being called on
> > > > > late_initcall or late_initcall_sync(), without any explanation for all the other
> > > > > changes (e.g. ima_init_core).
> > > > > 
> > > > > Please just limit the change to just calling ima_init() twice.
> > > > 
> > > > My concern is that ima_update_policy_flags() will be called
> > > > when ima_init() is deferred -- not initialised anything.
> > > > though functionally, it might be okay however,
> > > > I think ima_update_policy_flags() and notifier should work after ima_init()
> > > > works logically.
> > > > 
> > > > This change I think not much quite a lot. just wrapper ima_init() with
> > > > ima_init_core() with some error handling.
> > > > 
> > > > Am I missing something?
> > > 
> > > Also, if we handle in ima_init() only, but it failed with other reason,
> > > we shouldn't call again ima_init() in the late_initcall_sync.
> > > 
> > > To handle this, It wouldn't do in the ima_init() but we need to handle
> > > it by caller of ima_init().
> > 
> > Only tpm_default_chip() is being called to set the ima_tpm_chip.  On failure,
> > instead of going into TPM-bypass mode, return immediately.  There are no calls
> > to anything else.  Just call ima_init() a second time.
> 
> I’m not fully convinced this is sufficient.
> 
> What I meant is the case where ima_init() fails due to other
> initialisation steps, not only tpm_default_chip() (e.g. ima_fs_init()).

The purpose of THIS patch is to add late_initcall_sync, when the TPM is not
available at late_initcall.  This would be classified as a bug fix and would be
backported.  No other changes should be included in this patch.

> 
> If it fails at the late_initcall stage for such reasons, then we
> should not call ima_init() again at late_initcall_sync.
> 
> For this reason, instead of adding a static variable inside
> ima_init(), I think it would be better to manage the state in the
> caller and introduce something like an ima_initialised flag. Also, if
> initialisation fails for other reasons, the notifier block should be
> unregistered.

Defining a global file static variable, in lieu of a local static variable, is
fine. Defining two functions, one for late_initcall and another for
late_initcall_sync, that do nothing other than call ima_init() is also fine.
Please keep this patch as simple as possible.

> 
> I’d also like to ask again whether it is fine to call
> ima_update_policy_flags() and keep the notifier registered in the
> deferred TPM case. While this may be functionally acceptable, it seems
> logically questionable to do so when ima_init() has not completed.

Other than extending the TPM, IMA should behave exactly the same whether there
is a TPM or goes into TPM-bypass mode.

> 
> There is also a possibility that a deferred case ultimately fails (e.g.
> deferred at late_initcall, but then failing at late_initcall_sync
> for another reason, even while entering TPM bypass mode). In that case,
> it seems more appropriate to handle this state in the caller of
> ima_init(), rather than inside ima_init() itself.

If the TPM isn't found at late_initcall_sync(), then IMA should go into TPM-
bypass mode.  Please don't make any other changes to the existing IMA behavior
and hide it here behind the late_initcall_sync change.

> 
> Am I still missing something?

When your original patch moved the initialization from late_initcall to
late_initcall_sync, you didn't question anything.  There's absolutely no
difference between that and calling ima_init twice, as long as on late_initcall
ima_init() returns immediately if the TPM chip isn't defined.

Any other changes are superfluous.  Keep the patch simple!

Mimi

^ permalink raw reply

* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Yeoreum Yun @ 2026-04-23 11:20 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
	dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
	oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
	will, noodles, sebastianene
In-Reply-To: <56a8aab50a3b5ce0a345fc2079fb2abc7d0f1b23.camel@linux.ibm.com>

Hi Mimi,

> On Thu, 2026-04-23 at 06:55 +0100, Yeoreum Yun wrote:
> > > On Wed, 2026-04-22 at 20:41 +0100, Yeoreum Yun wrote:
> > > > > Hi Mimi,
> > > > >
> > > > > > On Wed, 2026-04-22 at 17:24 +0100, Yeoreum Yun wrote:
> > > > > > > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
> > > > > > > the TPM driver must be built as built-in and
> > > > > > > must be probed before the IMA subsystem is initialized.
> > > > > > >
> > > > > > > However, when the TPM device operates over the FF-A protocol using
> > > > > > > the CRB interface, probing fails and returns -EPROBE_DEFER if
> > > > > > > the tpm_crb_ffa device — an FF-A device that provides the communication
> > > > > > > interface to the tpm_crb driver — has not yet been probed.
> > > > > > >
> > > > > > > To ensure the TPM device operating over the FF-A protocol with
> > > > > > > the CRB interface is probed before IMA initialization,
> > > > > > > the following conditions must be met:
> > > > > > >
> > > > > > >    1. The corresponding ffa_device must be registered,
> > > > > > >       which is done via ffa_init().
> > > > > > >
> > > > > > >    2. The tpm_crb_driver must successfully probe this device via
> > > > > > >       tpm_crb_ffa_init().
> > > > > > >
> > > > > > >    3. The tpm_crb driver using CRB over FF-A can then
> > > > > > >       be probed successfully. (See crb_acpi_add() and
> > > > > > >       tpm_crb_ffa_init() for reference.)
> > > > > > >
> > > > > > > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
> > > > > > > all registered with device_initcall, which means crb_acpi_driver_init() may
> > > > > > > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
> > > > > > >
> > > > > > > When this occurs, probing the TPM device is deferred.
> > > > > > > However, the deferred probe can happen after the IMA subsystem
> > > > > > > has already been initialized, since IMA initialization is performed
> > > > > > > during late_initcall, and deferred_probe_initcall() is performed
> > > > > > > at the same level.
> > > > > > >
> > > > > > > To resolve this, call ima_init() again at late_inicall_sync level
> > > > > > > so that let IMA not miss TPM PCR value when generating boot_aggregate
> > > > > > > log though TPM device presents in the system.
> > > > > > >
> > > > > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > > > > >
> > > > > > A lot of change for just detecting whether ima_init() is being called on
> > > > > > late_initcall or late_initcall_sync(), without any explanation for all the other
> > > > > > changes (e.g. ima_init_core).
> > > > > >
> > > > > > Please just limit the change to just calling ima_init() twice.
> > > > >
> > > > > My concern is that ima_update_policy_flags() will be called
> > > > > when ima_init() is deferred -- not initialised anything.
> > > > > though functionally, it might be okay however,
> > > > > I think ima_update_policy_flags() and notifier should work after ima_init()
> > > > > works logically.
> > > > >
> > > > > This change I think not much quite a lot. just wrapper ima_init() with
> > > > > ima_init_core() with some error handling.
> > > > >
> > > > > Am I missing something?
> > > >
> > > > Also, if we handle in ima_init() only, but it failed with other reason,
> > > > we shouldn't call again ima_init() in the late_initcall_sync.
> > > >
> > > > To handle this, It wouldn't do in the ima_init() but we need to handle
> > > > it by caller of ima_init().
> > >
> > > Only tpm_default_chip() is being called to set the ima_tpm_chip.  On failure,
> > > instead of going into TPM-bypass mode, return immediately.  There are no calls
> > > to anything else.  Just call ima_init() a second time.
> >
> > I’m not fully convinced this is sufficient.
> >
> > What I meant is the case where ima_init() fails due to other
> > initialisation steps, not only tpm_default_chip() (e.g. ima_fs_init()).
>
> The purpose of THIS patch is to add late_initcall_sync, when the TPM is not
> available at late_initcall.  This would be classified as a bug fix and would be
> backported.  No other changes should be included in this patch.

Okay.

> >
> > I’d also like to ask again whether it is fine to call
> > ima_update_policy_flags() and keep the notifier registered in the
> > deferred TPM case. While this may be functionally acceptable, it seems
> > logically questionable to do so when ima_init() has not completed.
>
> Other than extending the TPM, IMA should behave exactly the same whether there
> is a TPM or goes into TPM-bypass mode.
>
> >
> > There is also a possibility that a deferred case ultimately fails (e.g.
> > deferred at late_initcall, but then failing at late_initcall_sync
> > for another reason, even while entering TPM bypass mode). In that case,
> > it seems more appropriate to handle this state in the caller of
> > ima_init(), rather than inside ima_init() itself.
>
> If the TPM isn't found at late_initcall_sync(), then IMA should go into TPM-
> bypass mode.  Please don't make any other changes to the existing IMA behavior
> and hide it here behind the late_initcall_sync change.

Okay. you're talking called ima_update_policy_flags() at late_initcall
wouldn't be not a problem even in case of late_initcall_sync's ima_init()
get failed with "TPM-bypass mode".

I see then, I'll make a patch simpler then.

Thanks.

--
Sincerely,
Yeoreum Yun

^ permalink raw reply

* Re: [PATCH] apparmor/lsm: Fix aa_dfa_unpack's error handling in aa_setup_dfa_engine
From: Georgia Garcia @ 2026-04-23 12:25 UTC (permalink / raw)
  To: GONG Ruiqi, John Johansen, Paul Moore, James Morris,
	Serge E . Hallyn
  Cc: apparmor, linux-security-module, linux-kernel, lujialin4,
	zhaoyipeng5
In-Reply-To: <20260423031056.563527-1-gongruiqi1@huawei.com>

On Thu, 2026-04-23 at 11:10 +0800, GONG Ruiqi wrote:
> aa_dfa_unpack returns ERR_PTR not NULL when it fails, but aa_put_dfa
> only checks NULL for its input, which would cause invalid memory access
> in aa_put_dfa. Set nulldfa to NULL explicitly to fix that.
> 

Thank you!

Acked-by: Georgia Garcia <georgia.garcia@canonical.com>

> Fixes: 98b824ff8984 ("apparmor: refcount the pdb")
> Signed-off-by: GONG Ruiqi <gongruiqi1@huawei.com>
> ---
>  security/apparmor/lsm.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> index c1d42fc72fdb..ead2f07982b6 100644
> --- a/security/apparmor/lsm.c
> +++ b/security/apparmor/lsm.c
> @@ -2465,6 +2465,7 @@ static int __init aa_setup_dfa_engine(void)
>  			    TO_ACCEPT2_FLAG(YYTD_DATA32));
>  	if (IS_ERR(nulldfa)) {
>  		error = PTR_ERR(nulldfa);
> +		nulldfa = NULL;
>  		goto fail;
>  	}
>  	nullpdb->dfa = aa_get_dfa(nulldfa);


^ permalink raw reply

* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Yeoreum Yun @ 2026-04-23 12:34 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
	dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
	oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
	will, noodles, sebastianene
In-Reply-To: <aeoAlVEwzRUPrlVe@e129823.arm.com>

> > On Thu, 2026-04-23 at 06:55 +0100, Yeoreum Yun wrote:
> > > > On Wed, 2026-04-22 at 20:41 +0100, Yeoreum Yun wrote:
> > > > > > Hi Mimi,
> > > > > >
> > > > > > > On Wed, 2026-04-22 at 17:24 +0100, Yeoreum Yun wrote:
> > > > > > > > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
> > > > > > > > the TPM driver must be built as built-in and
> > > > > > > > must be probed before the IMA subsystem is initialized.
> > > > > > > >
> > > > > > > > However, when the TPM device operates over the FF-A protocol using
> > > > > > > > the CRB interface, probing fails and returns -EPROBE_DEFER if
> > > > > > > > the tpm_crb_ffa device — an FF-A device that provides the communication
> > > > > > > > interface to the tpm_crb driver — has not yet been probed.
> > > > > > > >
> > > > > > > > To ensure the TPM device operating over the FF-A protocol with
> > > > > > > > the CRB interface is probed before IMA initialization,
> > > > > > > > the following conditions must be met:
> > > > > > > >
> > > > > > > >    1. The corresponding ffa_device must be registered,
> > > > > > > >       which is done via ffa_init().
> > > > > > > >
> > > > > > > >    2. The tpm_crb_driver must successfully probe this device via
> > > > > > > >       tpm_crb_ffa_init().
> > > > > > > >
> > > > > > > >    3. The tpm_crb driver using CRB over FF-A can then
> > > > > > > >       be probed successfully. (See crb_acpi_add() and
> > > > > > > >       tpm_crb_ffa_init() for reference.)
> > > > > > > >
> > > > > > > > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
> > > > > > > > all registered with device_initcall, which means crb_acpi_driver_init() may
> > > > > > > > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
> > > > > > > >
> > > > > > > > When this occurs, probing the TPM device is deferred.
> > > > > > > > However, the deferred probe can happen after the IMA subsystem
> > > > > > > > has already been initialized, since IMA initialization is performed
> > > > > > > > during late_initcall, and deferred_probe_initcall() is performed
> > > > > > > > at the same level.
> > > > > > > >
> > > > > > > > To resolve this, call ima_init() again at late_inicall_sync level
> > > > > > > > so that let IMA not miss TPM PCR value when generating boot_aggregate
> > > > > > > > log though TPM device presents in the system.
> > > > > > > >
> > > > > > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > > > > > >
> > > > > > > A lot of change for just detecting whether ima_init() is being called on
> > > > > > > late_initcall or late_initcall_sync(), without any explanation for all the other
> > > > > > > changes (e.g. ima_init_core).
> > > > > > >
> > > > > > > Please just limit the change to just calling ima_init() twice.
> > > > > >
> > > > > > My concern is that ima_update_policy_flags() will be called
> > > > > > when ima_init() is deferred -- not initialised anything.
> > > > > > though functionally, it might be okay however,
> > > > > > I think ima_update_policy_flags() and notifier should work after ima_init()
> > > > > > works logically.
> > > > > >
> > > > > > This change I think not much quite a lot. just wrapper ima_init() with
> > > > > > ima_init_core() with some error handling.
> > > > > >
> > > > > > Am I missing something?
> > > > >
> > > > > Also, if we handle in ima_init() only, but it failed with other reason,
> > > > > we shouldn't call again ima_init() in the late_initcall_sync.
> > > > >
> > > > > To handle this, It wouldn't do in the ima_init() but we need to handle
> > > > > it by caller of ima_init().
> > > >
> > > > Only tpm_default_chip() is being called to set the ima_tpm_chip.  On failure,
> > > > instead of going into TPM-bypass mode, return immediately.  There are no calls
> > > > to anything else.  Just call ima_init() a second time.
> > >
> > > I’m not fully convinced this is sufficient.
> > >
> > > What I meant is the case where ima_init() fails due to other
> > > initialisation steps, not only tpm_default_chip() (e.g. ima_fs_init()).
> >
> > The purpose of THIS patch is to add late_initcall_sync, when the TPM is not
> > available at late_initcall.  This would be classified as a bug fix and would be
> > backported.  No other changes should be included in this patch.
>
> Okay.
>
> > >
> > > I’d also like to ask again whether it is fine to call
> > > ima_update_policy_flags() and keep the notifier registered in the
> > > deferred TPM case. While this may be functionally acceptable, it seems
> > > logically questionable to do so when ima_init() has not completed.
> >
> > Other than extending the TPM, IMA should behave exactly the same whether there
> > is a TPM or goes into TPM-bypass mode.
> >
> > >
> > > There is also a possibility that a deferred case ultimately fails (e.g.
> > > deferred at late_initcall, but then failing at late_initcall_sync
> > > for another reason, even while entering TPM bypass mode). In that case,
> > > it seems more appropriate to handle this state in the caller of
> > > ima_init(), rather than inside ima_init() itself.
> >
> > If the TPM isn't found at late_initcall_sync(), then IMA should go into TPM-
> > bypass mode.  Please don't make any other changes to the existing IMA behavior
> > and hide it here behind the late_initcall_sync change.
>
> Okay. you're talking called ima_update_policy_flags() at late_initcall
> wouldn't be not a problem even in case of late_initcall_sync's ima_init()
> get failed with "TPM-bypass mode".
>
> I see then, I'll make a patch simpler then.

But I think in case of below situation:
  - late_initcall's first ima_init() is deferred.
  - late_initcall_sync try again but failed and try again with
    CONFIG_IMA_DEFAULT_HASH.

I would like to sustain init_ima_core to reduce the same code repeat
in late_initcall_sync.

--
Sincerely,
Yeoreum Yun

^ permalink raw reply

* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Jonathan McDowell @ 2026-04-23 12:53 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: Mimi Zohar, linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
	dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
	oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
	will, noodles, sebastianene
In-Reply-To: <aeoRxWPyOHGJd+Jh@e129823.arm.com>

On Thu, Apr 23, 2026 at 01:34:13PM +0100, Yeoreum Yun wrote:
>> > On Thu, 2026-04-23 at 06:55 +0100, Yeoreum Yun wrote:
>> > > > On Wed, 2026-04-22 at 20:41 +0100, Yeoreum Yun wrote:
>> > > > > > Hi Mimi,
>> > > > > >
>> > > > > > > On Wed, 2026-04-22 at 17:24 +0100, Yeoreum Yun wrote:
>> > > > > > > > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
>> > > > > > > > the TPM driver must be built as built-in and
>> > > > > > > > must be probed before the IMA subsystem is initialized.
>> > > > > > > >
>> > > > > > > > However, when the TPM device operates over the FF-A protocol using
>> > > > > > > > the CRB interface, probing fails and returns -EPROBE_DEFER if
>> > > > > > > > the tpm_crb_ffa device — an FF-A device that provides the communication
>> > > > > > > > interface to the tpm_crb driver — has not yet been probed.
>> > > > > > > >
>> > > > > > > > To ensure the TPM device operating over the FF-A protocol with
>> > > > > > > > the CRB interface is probed before IMA initialization,
>> > > > > > > > the following conditions must be met:
>> > > > > > > >
>> > > > > > > >    1. The corresponding ffa_device must be registered,
>> > > > > > > >       which is done via ffa_init().
>> > > > > > > >
>> > > > > > > >    2. The tpm_crb_driver must successfully probe this device via
>> > > > > > > >       tpm_crb_ffa_init().
>> > > > > > > >
>> > > > > > > >    3. The tpm_crb driver using CRB over FF-A can then
>> > > > > > > >       be probed successfully. (See crb_acpi_add() and
>> > > > > > > >       tpm_crb_ffa_init() for reference.)
>> > > > > > > >
>> > > > > > > > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
>> > > > > > > > all registered with device_initcall, which means crb_acpi_driver_init() may
>> > > > > > > > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
>> > > > > > > >
>> > > > > > > > When this occurs, probing the TPM device is deferred.
>> > > > > > > > However, the deferred probe can happen after the IMA subsystem
>> > > > > > > > has already been initialized, since IMA initialization is performed
>> > > > > > > > during late_initcall, and deferred_probe_initcall() is performed
>> > > > > > > > at the same level.
>> > > > > > > >
>> > > > > > > > To resolve this, call ima_init() again at late_inicall_sync level
>> > > > > > > > so that let IMA not miss TPM PCR value when generating boot_aggregate
>> > > > > > > > log though TPM device presents in the system.
>> > > > > > > >
>> > > > > > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
>> > > > > > >
>> > > > > > > A lot of change for just detecting whether ima_init() is being called on
>> > > > > > > late_initcall or late_initcall_sync(), without any explanation for all the other
>> > > > > > > changes (e.g. ima_init_core).
>> > > > > > >
>> > > > > > > Please just limit the change to just calling ima_init() twice.
>> > > > > >
>> > > > > > My concern is that ima_update_policy_flags() will be called
>> > > > > > when ima_init() is deferred -- not initialised anything.
>> > > > > > though functionally, it might be okay however,
>> > > > > > I think ima_update_policy_flags() and notifier should work after ima_init()
>> > > > > > works logically.
>> > > > > >
>> > > > > > This change I think not much quite a lot. just wrapper ima_init() with
>> > > > > > ima_init_core() with some error handling.
>> > > > > >
>> > > > > > Am I missing something?
>> > > > >
>> > > > > Also, if we handle in ima_init() only, but it failed with other reason,
>> > > > > we shouldn't call again ima_init() in the late_initcall_sync.
>> > > > >
>> > > > > To handle this, It wouldn't do in the ima_init() but we need to handle
>> > > > > it by caller of ima_init().
>> > > >
>> > > > Only tpm_default_chip() is being called to set the ima_tpm_chip.  On failure,
>> > > > instead of going into TPM-bypass mode, return immediately.  There are no calls
>> > > > to anything else.  Just call ima_init() a second time.
>> > >
>> > > I’m not fully convinced this is sufficient.
>> > >
>> > > What I meant is the case where ima_init() fails due to other
>> > > initialisation steps, not only tpm_default_chip() (e.g. ima_fs_init()).
>> >
>> > The purpose of THIS patch is to add late_initcall_sync, when the TPM is not
>> > available at late_initcall.  This would be classified as a bug fix and would be
>> > backported.  No other changes should be included in this patch.
>>
>> Okay.
>>
>> > >
>> > > I’d also like to ask again whether it is fine to call
>> > > ima_update_policy_flags() and keep the notifier registered in the
>> > > deferred TPM case. While this may be functionally acceptable, it seems
>> > > logically questionable to do so when ima_init() has not completed.
>> >
>> > Other than extending the TPM, IMA should behave exactly the same whether there
>> > is a TPM or goes into TPM-bypass mode.
>> >
>> > >
>> > > There is also a possibility that a deferred case ultimately fails (e.g.
>> > > deferred at late_initcall, but then failing at late_initcall_sync
>> > > for another reason, even while entering TPM bypass mode). In that case,
>> > > it seems more appropriate to handle this state in the caller of
>> > > ima_init(), rather than inside ima_init() itself.
>> >
>> > If the TPM isn't found at late_initcall_sync(), then IMA should go into TPM-
>> > bypass mode.  Please don't make any other changes to the existing IMA behavior
>> > and hide it here behind the late_initcall_sync change.
>>
>> Okay. you're talking called ima_update_policy_flags() at late_initcall
>> wouldn't be not a problem even in case of late_initcall_sync's ima_init()
>> get failed with "TPM-bypass mode".
>>
>> I see then, I'll make a patch simpler then.
>
>But I think in case of below situation:
>  - late_initcall's first ima_init() is deferred.
>  - late_initcall_sync try again but failed and try again with
>    CONFIG_IMA_DEFAULT_HASH.
>
>I would like to sustain init_ima_core to reduce the same code repeat
>in late_initcall_sync.

I think what Mimi's proposing is:

If we're in late_initcall, and the TPM isn't available, return 
immediately with an error (the EPROBE_DEFER?), don't do any init.

If we're in late_initcall_sync, either we're already initialised, so do 
return and nothing, or run through the entire flow, even if the TPM 
isn't unavailable.

So ima_init() just needs to know a) if it's in the sync or non-sync mode 
and b) for the sync mode, if we've already done the init at
non-sync.

J.

-- 
... I'm not popular enough to be different.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox