Linux Security Modules development

* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Sudeep Holla @ 2026-04-20 15:47 UTC
  To: Yeoreum Yun
  Cc: Will Deacon, Marc Zyngier, linux-security-module, linux-kernel,
	Sudeep Holla, linux-integrity, linux-arm-kernel, kvmarm, paul,
	jmorris, serge, zohar, roberto.sassu, dmitry.kasatkin,
	eric.snowberg, peterhuewe, jarkko, jgg, oupton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, sebastianene
In-Reply-To: <aeYGeojpqcYAN5++@e129823.arm.com>

On Mon, Apr 20, 2026 at 11:56:58AM +0100, Yeoreum Yun wrote:
> Hi Will,
> 
> > [+Seb for the pKVM FFA bits]
> >
> > Ah sorry, I mixed up the ordering of 'module_init' vs 'rootfs_initcall'
> > and thought you wanted to probe the version earlier. But then I'm still
> > confused because, prior to 0e0546eabcd6 ("firmware: arm_ffa: Change
> > initcall level of ffa_init() to rootfs_initcall"), ffa_init() was a
> > 'device_initcall' which is still called earlier than finalize_pkvm().
> 
> Right, and this is what I missed when writing patch
> 0e0546eabcd6 ("firmware: arm_ffa: Change initcall level of ffa_init() to rootfs_initcall"),
> and the issue still exists even if it's a device_initcall.
> 
> However, rather than changing ffa_init to rootfs_initcall, moving ima_init
> to late_initcall_sync is a better approach, as it also addresses similar
> issues for TPM devices that do not use FF-A. For this reason,
> the FF-A-related changes were reverted.
> 
> As a result, patch 4/4 addresses an issue that existed independently of
> 0e0546eabcd6, as you pointed out.
> 

I was not fully convinced by commit 0e0546eabcd6 ("firmware: arm_ffa: Change
initcall level of ffa_init() to rootfs_initcall"), and I had raised this
concern at the time. However, in the absence of a better alternative, we
proceeded with merging it.

My concern remains essentially the same. That change moved the initcall one
stage earlier, and now, by introducing `late_initcall_sync()`, we are
effectively shifting the dependency issue one stage later instead of resolving
it in a more fundamental way. From my perspective, this still relies on
adjusting initcall ordering as the primary means of making the dependency
work.

I do not think that is a robust or sustainable approach. Tweaking initcall
levels tends to be inherently fragile because it addresses the symptom through
sequencing rather than establishing a clear and explicit dependency model.

I also recall that `finalize_pkvm()` is itself at `device_initcall` level. If
that is correct, would this not introduce another ordering issue or at least
leave us exposed to similar dependency problems? That is exactly why I remain
uneasy about solving this by continuing to move initcalls backward or forward.

More broadly, the fact that we are revisiting the same class of issue again
after such a short time reinforces my concern that this direction is not
sufficiently stable. We may well have to revisit it soon after merging this approach.

-- 
Regards,
Sudeep

* Re: [RFC PATCH v1 00/11] Landlock: Namespace and capability control
From: Günther Noack @ 2026-04-20 15:06 UTC
  To: Mickaël Salaün
  Cc: Christian Brauner, Günther Noack, Paul Moore,
	Serge E . Hallyn, Justin Suess, Lennart Poettering,
	Mikhail Ivanov, Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang,
	kernel-team, linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260312100444.2609563-1-mic@digikod.net>

Hello!

On Thu, Mar 12, 2026 at 11:04:33AM +0100, Mickaël Salaün wrote:
> Namespaces are a fundamental building block for containers and
> application sandboxes, but user namespace creation significantly widens
> the kernel attack surface.  CVE-2022-0185 (filesystem mount parsing),
> CVE-2022-25636 and CVE-2023-32233 (netfilter), and CVE-2022-0492 (cgroup
> v1 release_agent) all demonstrate vulnerabilities exploitable only
> through capabilities gained via user namespaces.  Some distributions
> block user namespace creation entirely, but this removes a useful
> isolation primitive.  Fine-grained control allows trusted programs to
> use namespaces while preventing unnecessary exposure for programs that
> do not need them.
> 
> Existing mechanisms (user.max_*_namespaces sysctls, userns_create LSM
> hook, PR_SET_NO_NEW_PRIVS, and capset) each address part of this threat
> but none provides per-process, fine-grained control over both namespace
> types and capabilities.  Container runtimes resort to seccomp-based
> clone/unshare filtering, but seccomp cannot dereference clone3's flag
> structure, forcing runtimes to block clone3 entirely.
> 
> Landlock's composable layer model enables several patterns: a user
> session manager can restrict namespace types and capabilities broadly
> while allowing trusted programs to create the namespaces they need, and
> each deeper layer can further restrict the allowed set.  Container
> runtimes can similarly deny namespace creation inside managed
> containers.

I assume we are talking about an unrestricted systemd user session
manager, which would not itself be restricted?  (If the entire user
session were running under Landlock, users couldn't change their
passwords with "passwd" any more, because of the no_new_privs
requirement.)


> This series adds two new permission categories to Landlock:
> 
> - LANDLOCK_PERM_NAMESPACE_ENTER: Restricts which namespace types a
>   sandboxed process can acquire: both creation (unshare/clone) and entry
>   (setns).  User namespace creation has no capability check in the
>   kernel, so this is the only enforcement mechanism for that entry
>   point.
> 
> - LANDLOCK_PERM_CAPABILITY_USE: Restricts which Linux capabilities a
>   sandboxed process can use, regardless of how they were obtained
>   (including through user namespace creation).

Given that you already went through multiple iterations here, I fully
expect that I am overlooking something here, but based on the
explanation, it's not clear to me why the capability control is needed
in addition to the namespace control, to reduce the kernel attack
surface.

In my understanding the "attack surface" problem with user namespaces
is that they allow unprivileged processes to gain CAP_SYS_ADMIN within
that namespace, which unlocks access to code paths which were
traditionally reserved for the (top level) root user.

But then, to prevent that from happening, it seems that restricting
access to user namespace creation would be sufficient?

(Also, in some cases, I suspect it might be possible to break
assumptions that more privileged processes make about filesystem
layout if the user can change the mount layout.  But that is not an
issue with Landlock, as we forbid changes to mounts and also require
no_new_privs.)


> Both use new handled_perm and LANDLOCK_RULE_* constants following the
> existing allow-list model.  The UAPI uses raw CAP_* and CLONE_NEW*
> values directly; unknown values are silently accepted for forward
> compatibility (the allow-list denies them by default).  The Landlock ABI
> version is bumped from 8 to 9.

Compatibility question:

For both permission categories, when they are "handled" in the
ruleset, they default to denying *all* types of namespaces, and *all*
types of capabilities.

This is different to the handled_access_* rights, where we are
requiring users to explicitly list all restricted rights as "handled",
because the full list of available operations might be a moving
target.

Why is this not a problem for capabilities and for namespaces?  Both
the list of capabilities and the list of namespaces have been expanded
in the past.  What happens if a new capability or namespace is
invented?  If these evolve, is that backwards compatible for the
existing users of these Landlock permission categories?
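
To make the compatibility question concrete, here is a minimal sketch of
a deny-by-default allow-list check, assuming a single layer that stores
its allowed capabilities as a 64-bit mask.  struct layer_model,
cap_allowed() and the MODEL_* names are hypothetical and are not taken
from the patch series; only the numeric values of CAP_NET_ADMIN (12) and
CAP_SYS_ADMIN (21) come from the kernel UAPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values match CAP_NET_ADMIN and CAP_SYS_ADMIN in linux/capability.h. */
#define MODEL_CAP_NET_ADMIN 12u
#define MODEL_CAP_SYS_ADMIN 21u

/* Hypothetical model of one layer; not the series' data structure. */
struct layer_model {
	bool handles_caps;     /* the capability category is "handled" */
	uint64_t allowed_caps; /* bit N set means capability N is allowed */
};

/* Deny by default: a handled category only permits explicitly set bits. */
static bool cap_allowed(const struct layer_model *layer, unsigned int cap)
{
	assert(cap < 64);
	if (!layer->handles_caps)
		return true; /* unhandled category: this layer does not restrict */
	return (layer->allowed_caps >> cap) & 1;
}
```

Under this model, a capability number that only a later kernel defines is
denied by every layer that handles the category, which is the behaviour
the question above is asking about.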


> The handled_perm infrastructure is designed to be reusable by future
> permission categories.  The last patch documents the design rationale
> for the permission model and the criteria for choosing between
> handled_access_*, handled_perm, and scoped.  A patch series to add
> socket creation control is under review [2]; it could benefit from the
> same permission model to achieve complete deny-by-default coverage of
> socket creation.
> 
> This series builds on Christian Brauner's namespace LSM blob RFC [1],
> included as patch 1.
> 
> Christian, could you please review patch 3?  It adds a FOR_EACH_NS_TYPE
> X-macro to ns_common_types.h and derives CLONE_NS_ALL, replacing inline
> CLONE_NEW* flag enumerations in nsproxy.c and fork.c.
> 
> Paul, could you please review patch 2?  It adds LSM_AUDIT_DATA_NS, a new
> audit record type that logs namespace_type and inum for
> namespace-related LSM denials.
> 
> All four example vulnerabilities follow the same pattern: an
> unprivileged user creates a user namespace to obtain capabilities, then
> creates a second namespace to exercise them against vulnerable code.
> LANDLOCK_PERM_NAMESPACE_ENTER prevents this by denying the user
> namespace (eliminating the capability grant) or the specific namespace
> type needed to exercise it.  LANDLOCK_PERM_CAPABILITY_USE independently
> prevents it by denying the required capability.

Here, it is also not clear to me why LANDLOCK_PERM_CAPABILITY_USE is
needed in addition to LANDLOCK_PERM_NAMESPACE_ENTER.

Looking at capabilities(7), my understanding is that capabilities can
only be acquired through:

(1) user namespaces (prevented with LANDLOCK_PERM_NAMESPACE_ENTER)
(2) execve (setuid or individual capabilities, prevented using
    PR_SET_NO_NEW_PRIVS)

...so if a process were to start out with no such capabilities,
wouldn't that be enough to prevent it from gaining more?  Am I
overlooking another way through which these can be acquired?

The Landlock capability support adds a "filter" for the use of
capabilities, but my understanding of the capability system was that
it already *is* that filter.  As long as we prevent the acquisition of
new capabilities, shouldn't that be sufficient?


> Namespace restriction is enforced at two hook sites: namespace_alloc
> (unshare/clone) and namespace_install (setns).  Together, these ensure a
> process denied a namespace type cannot circumvent the restriction by
> entering a pre-existing namespace via setns() on an inherited or passed
> file descriptor.  When a domain handles both permissions, both must
> independently allow the operation (e.g., unshare(CLONE_NEWNET) requires
> both CAP_SYS_ADMIN to be allowed and CLONE_NEWNET to be allowed).
> 
> Design evolution:
> 
> The first approach added CAP_OPT flags to security_capable() to
> distinguish namespace creation contexts.  This was too invasive and
> would have required capability splitting (a dedicated CAP_NAMESPACE)
> which does not help because the CAP_SYS_ADMIN fallback for backward
> compatibility undermines the distinction.
> 
> The second stored the namespace creator's domain in the LSM blob and
> used domain ancestry comparison in hook_capable() to bypass capability
> checks for namespace management operations.  A SCOPE_NAMESPACE flag
> restricted setns() by the namespace creator's domain, like SCOPE_SIGNAL.
> Both were dropped: scopes should only concern Landlock properties
> (domain relationships), not kernel namespace state; and the
> cross-namespace heuristic (ns != cred->user_ns) did not accurately
> identify namespace management operations.
> 
> The final design drops all of this.  The key insight is that
> capabilities gained through user namespace creation are only exercisable
> against namespaces of a specific type: creating a network namespace is
> what makes CAP_NET_ADMIN exercisable.  LANDLOCK_PERM_NAMESPACE_ENTER
> controls where capabilities are exercisable by restricting which
> namespace types can be acquired.  LANDLOCK_PERM_CAPABILITY_USE controls
> which capabilities are available, as a pure per-layer bitmask check with
> no namespace awareness.  The two are independently enforced at their own
> hook sites, with no interaction in hook_capable().  No scope flag is
> added in this series.
> 
> Note that when Landlock filesystem restrictions are in use, mount
> namespace creation has an inherent limitation: all mount topology
> changes are denied when any filesystem right is handled, which is
> optional.  A dedicated mount access control type is left for future work
> [3].
> 
> https://lore.kernel.org/r/20260216-work-security-namespace-v1-1-075c28758e1f@kernel.org [1]
> https://lore.kernel.org/r/20251118134639.3314803-1-ivanov.mikhail1@huawei-partners.com [2]
> https://github.com/landlock-lsm/linux/issues/14 [3]
> 
> Christian Brauner (1):
>   security: add LSM blob and hooks for namespaces
> 
> Mickaël Salaün (10):
>   security: Add LSM_AUDIT_DATA_NS for namespace audit records
>   nsproxy: Add FOR_EACH_NS_TYPE() X-macro and CLONE_NS_ALL
>   landlock: Wrap per-layer access masks in struct layer_rights
>   landlock: Enforce namespace entry restrictions
>   landlock: Enforce capability restrictions
>   selftests/landlock: Drain stale audit records on init
>   selftests/landlock: Add namespace restriction tests
>   selftests/landlock: Add capability restriction tests
>   samples/landlock: Add capability and namespace restriction support
>   landlock: Add documentation for capability and namespace restrictions
> 
>  Documentation/admin-guide/LSM/landlock.rst    |   19 +-
>  Documentation/security/landlock.rst           |   80 +-
>  Documentation/userspace-api/landlock.rst      |  156 +-
>  include/linux/lsm_audit.h                     |    5 +
>  include/linux/lsm_hook_defs.h                 |    3 +
>  include/linux/lsm_hooks.h                     |    1 +
>  include/linux/ns/ns_common_types.h            |   47 +-
>  include/linux/security.h                      |   20 +
>  include/uapi/linux/landlock.h                 |   89 +-
>  kernel/fork.c                                 |    7 +-
>  kernel/nscommon.c                             |   12 +
>  kernel/nsproxy.c                              |   21 +-
>  samples/landlock/sandboxer.c                  |  164 +-
>  security/landlock/Makefile                    |    2 +
>  security/landlock/access.h                    |   72 +-
>  security/landlock/audit.c                     |    8 +
>  security/landlock/audit.h                     |    2 +
>  security/landlock/cap.c                       |  142 ++
>  security/landlock/cap.h                       |   49 +
>  security/landlock/cred.h                      |   47 +-
>  security/landlock/limits.h                    |    9 +
>  security/landlock/ns.c                        |  188 +++
>  security/landlock/ns.h                        |   74 +
>  security/landlock/ruleset.c                   |   23 +-
>  security/landlock/ruleset.h                   |   53 +-
>  security/landlock/setup.c                     |    4 +
>  security/landlock/syscalls.c                  |  124 +-
>  security/lsm_audit.c                          |    4 +
>  security/lsm_init.c                           |    2 +
>  security/security.c                           |   76 +
>  tools/testing/selftests/landlock/audit.h      |   29 +-
>  tools/testing/selftests/landlock/audit_test.c |    2 -
>  tools/testing/selftests/landlock/base_test.c  |   20 +-
>  tools/testing/selftests/landlock/cap_test.c   |  614 ++++++++
>  tools/testing/selftests/landlock/common.h     |   23 +
>  tools/testing/selftests/landlock/config       |    5 +
>  tools/testing/selftests/landlock/ns_test.c    | 1379 +++++++++++++++++
>  tools/testing/selftests/landlock/wrappers.h   |    6 +
>  38 files changed, 3487 insertions(+), 94 deletions(-)
>  create mode 100644 security/landlock/cap.c
>  create mode 100644 security/landlock/cap.h
>  create mode 100644 security/landlock/ns.c
>  create mode 100644 security/landlock/ns.h
>  create mode 100644 tools/testing/selftests/landlock/cap_test.c
>  create mode 100644 tools/testing/selftests/landlock/ns_test.c
> 
> 
> base-commit: 5dfb8077be2bbe2c3b9477da759e80fa9f98da42
> -- 
> 2.53.0
> 

FWIW, I have also skimmed through some of the code and documentation
and the code seemed very clean so far.

–Günther

* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Yeoreum Yun @ 2026-04-20 15:04 UTC
  To: Sebastian Ene
  Cc: Marc Zyngier, linux-security-module, linux-kernel,
	linux-integrity, linux-arm-kernel, kvmarm, paul, jmorris, serge,
	zohar, roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe,
	jarkko, jgg, sudeep.holla, oupton, joey.gouly, suzuki.poulose,
	yuzenghui, catalin.marinas, will
In-Reply-To: <aeY2M3v97c00JjFe@google.com>

On Mon, Apr 20, 2026 at 02:20:35PM +0000, Sebastian Ene wrote:
> On Mon, Apr 20, 2026 at 01:46:47PM +0100, Marc Zyngier wrote:
> > On Mon, 20 Apr 2026 13:32:32 +0100,
> > Sebastian Ene <sebastianene@google.com> wrote:
> > >
> > > On Fri, Apr 17, 2026 at 06:57:59PM +0100, Yeoreum Yun wrote:
> > >
> > > Hello Yeoreum,
> > >
> > >
> > > > When pKVM is enabled, the FF-A driver must be initialized after pKVM.
> > > > Otherwise, pKVM cannot negotiate the FF-A version or
> > > > obtain RX/TX buffer information, leading to failures in FF-A calls.
> > >
> > > At the moment this already happens after you move back ffa_init() to
> > > device_initcall().
> >
> > But relying on this sort of ordering is just making things more
> > fragile.
> >
>
> Thanks for letting me know. Since this is not a solid construct we will have
> to change the driver init code to come after pKVM in this case.
>
> > >
> > > >
> > > > During FF-A driver initialization, check whether pKVM has been initialized.
> > > > If not, defer probing of the FF-A driver.
> > > >
> > >
> > > I don't think you need to add this dependency. pKVM is
> > > installed through KVM's module_init() which ends up calling hyp_ffa_init() to
> > > do the proxy initialization. The ARM-FFA driver comes after it (since
> > > pKVM is arch specific code). We don't have to call finalize_pkvm(..) to
> > > be able to handle smc(FF-A) calls in the hyp-proxy.
> >
> > You do. Without the finalisation, SMCs are not trapped by EL2.
> >
> > And even if it did, relying on such hack is just wrong.
> >
>
> That makes it an even stronger argument to move the driver init at a
> later stage. I was relying on this to trap early ff-a when the
> ARM FF-A driver was used.


I don’t think moving the FF-A driver initialization to a later stage is
a viable solution. For example, even if it is moved to device_initcall_sync,
it still relies on fragile ordering.

Similarly, moving it to late_initcall is problematic.
Since deferred_probe_initcall() runs at the same level, if it is invoked first,
devices that depend on FF-A (e.g. tpm_ffa_crb) may not be probed correctly,
leading to deferred devices not being handled properly.

Therefore, the FF-A driver should be able to detect when pKVM has been
initialized and perform its initialization accordingly. Alternatively, it
could simply rely on the SMC trap being in place once kvm_arm_initialised is set.
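
As a rough illustration of the "detect and defer" idea, the following
userspace sketch models a probe routine that returns -EPROBE_DEFER until
its dependency is ready. The names pkvm_ready, ffa_probe_model() and
probe_until_ready() are hypothetical and do not come from the patch; only
the numeric value 517 matches the kernel's EPROBE_DEFER.

```c
#include <assert.h>
#include <stdbool.h>

/* Same numeric value as the kernel's internal EPROBE_DEFER. */
#define EPROBE_DEFER 517

/* Hypothetical stand-in for "pKVM has finished initialising". */
static bool pkvm_ready;

/* Hypothetical probe: refuse to run until the dependency is ready. */
static int ffa_probe_model(void)
{
	if (!pkvm_ready)
		return -EPROBE_DEFER;
	return 0;
}

/*
 * Retry the probe up to max_tries times. The dependency becomes ready
 * just before attempt number ready_at (0-based). Returns the number of
 * attempts that were needed, or -1 if the probe never succeeded.
 */
static int probe_until_ready(int ready_at, int max_tries)
{
	assert(max_tries > 0);
	pkvm_ready = false;
	for (int i = 0; i < max_tries; i++) {
		if (i == ready_at)
			pkvm_ready = true;
		if (ffa_probe_model() == 0)
			return i + 1;
	}
	return -1;
}
```

The point of the sketch is that the outcome depends on the explicit
readiness check and not on which initcall happens to run first.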

--
Sincerely,
Yeoreum Yun

* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Yeoreum Yun @ 2026-04-20 14:47 UTC
  To: Sebastian Ene
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe, jarkko,
	jgg, sudeep.holla, maz, oupton, joey.gouly, suzuki.poulose,
	yuzenghui, catalin.marinas, will
In-Reply-To: <aeYypF4lv6LMH2ch@google.com>

Hi Sebastian,

> On Mon, Apr 20, 2026 at 02:00:57PM +0100, Yeoreum Yun wrote:
>
> Hi,
>
> >
> > Hi Sebastian,
> > > On Fri, Apr 17, 2026 at 06:57:59PM +0100, Yeoreum Yun wrote:
> > >
> > > Hello Yeoreum,
> > >
> > >
> > > > When pKVM is enabled, the FF-A driver must be initialized after pKVM.
> > > > Otherwise, pKVM cannot negotiate the FF-A version or
> > > > obtain RX/TX buffer information, leading to failures in FF-A calls.
> > >
> > > At the moment this already happens after you move back ffa_init() to
> > > device_initcall().
> >
> > How? kvm_arm_init() is a device_initcall() when both are built in.
> >
> > >
> > > >
> > > > During FF-A driver initialization, check whether pKVM has been initialized.
> > > > If not, defer probing of the FF-A driver.
> > > >
> > >
> > > I don't think you need to add this dependency. pKVM is
> > > installed through KVM's module_init() which ends up calling hyp_ffa_init() to
> > > do the proxy initialization. The ARM-FFA driver comes after it (since
> > > pKVM is arch specific code). We don't have to call finalize_pkvm(..) to
> > > be able to handle smc(FF-A) calls in the hyp-proxy.
> > >
> >
> > As Marc said, before finalize_pkvm(), SMCs wouldn't be trapped
> > to pKVM. IOW, when both are built in,
>
> They are, I tested before replying to this thread. The HCR_EL2 is
> 0x480080000 so HCR_EL2 TSC bit is set so SMC/FF-A and trapping is enabled.

Oh, I missed that cpu_init_hyp_mode() sets up HCR_EL2, so you're right.
Thanks for correcting me ;)

>
> In __pkvm_prot_finalize it sets the HCR_VM bit which enables stage-2 and
> then write the HCR_EL2 from params->hcr_el2. However I wasn't sure that
> this is seen as a 'hack' and not expected to work.
>
> > if ffa_init() is called before finalised_pkvm(),
> > it couldn't proxy the FFA_VERSION, FFA_RXTX_MAP and FFA_PARTITION_INFO_GET
> > called by ffa_init().
> >
> > How can you guarantee that hyp_ffa_init(), which is called by kvm_arm_init(),
> > comes first even when kvm_arm_init() and ffa_init() are both device_initcalls?
> >
>
> While they are both on device_initcall, the only difference is that
> kvm_arm_init is arch code which appears before the driver/ code in the
> linker. That's why Marc said it is not a solid construct to rely on
> this.

Then I think the original approach, just checking kvm_arm_initialised
in the FF-A driver, is enough, since I had wrongly assumed that the TSC bit
was only set up after finalize_pkvm().

Or am I missing something?

Thanks.

--
Sincerely,
Yeoreum Yun

* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Sebastian Ene @ 2026-04-20 14:20 UTC
  To: Marc Zyngier
  Cc: Yeoreum Yun, linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe, jarkko,
	jgg, sudeep.holla, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will
In-Reply-To: <86mryx2408.wl-maz@kernel.org>

On Mon, Apr 20, 2026 at 01:46:47PM +0100, Marc Zyngier wrote:
> On Mon, 20 Apr 2026 13:32:32 +0100,
> Sebastian Ene <sebastianene@google.com> wrote:
> > 
> > On Fri, Apr 17, 2026 at 06:57:59PM +0100, Yeoreum Yun wrote:
> > 
> > Hello Yeoreum,
> > 
> > 
> > > When pKVM is enabled, the FF-A driver must be initialized after pKVM.
> > > Otherwise, pKVM cannot negotiate the FF-A version or
> > > obtain RX/TX buffer information, leading to failures in FF-A calls.
> > 
> > At the moment this already happens after you move back ffa_init() to
> > device_initcall().
> 
> But relying on this sort of ordering is just making things more
> fragile.
> 

Thanks for letting me know. Since this is not a solid construct we will have
to change the driver init code to come after pKVM in this case.

> > 
> > > 
> > > During FF-A driver initialization, check whether pKVM has been initialized.
> > > If not, defer probing of the FF-A driver.
> > > 
> > 
> > I don't think you need to add this dependency. pKVM is
> > installed through KVM's module_init() which ends up calling hyp_ffa_init() to
> > do the proxy initialization. The ARM-FFA driver comes after it (since
> > pKVM is arch specific code). We don't have to call finalize_pkvm(..) to
> > be able to handle smc(FF-A) calls in the hyp-proxy.
> 
> You do. Without the finalisation, SMCs are not trapped by EL2.
> 
> And even if it did, relying on such hack is just wrong.
> 

That makes it an even stronger argument for moving the driver init to a
later stage. I was relying on this to trap early FF-A calls when the
ARM FF-A driver was used.

> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.

Thanks,
Sebastian

* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Sebastian Ene @ 2026-04-20 14:05 UTC
  To: Yeoreum Yun
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe, jarkko,
	jgg, sudeep.holla, maz, oupton, joey.gouly, suzuki.poulose,
	yuzenghui, catalin.marinas, will
In-Reply-To: <aeYjiaPtAl7SMVkL@e129823.arm.com>

On Mon, Apr 20, 2026 at 02:00:57PM +0100, Yeoreum Yun wrote:

Hi,

> 
> Hi Sebastian,
> > On Fri, Apr 17, 2026 at 06:57:59PM +0100, Yeoreum Yun wrote:
> >
> > Hello Yeoreum,
> >
> >
> > > When pKVM is enabled, the FF-A driver must be initialized after pKVM.
> > > Otherwise, pKVM cannot negotiate the FF-A version or
> > > obtain RX/TX buffer information, leading to failures in FF-A calls.
> >
> > At the moment this already happens after you move back ffa_init() to
> > device_initcall().
> 
> How? kvm_arm_init() is a device_initcall() when both are built in.
> 
> >
> > >
> > > During FF-A driver initialization, check whether pKVM has been initialized.
> > > If not, defer probing of the FF-A driver.
> > >
> >
> > I don't think you need to add this dependency. pKVM is
> > installed through KVM's module_init() which ends up calling hyp_ffa_init() to
> > do the proxy initialization. The ARM-FFA driver comes after it (since
> > pKVM is arch specific code). We don't have to call finalize_pkvm(..) to
> > be able to handle smc(FF-A) calls in the hyp-proxy.
> >
> 
> As Marc said, before finalize_pkvm(), SMCs wouldn't be trapped
> to pKVM. IOW, when both are built in,

They are; I tested this before replying to the thread. HCR_EL2 is
0x480080000, so the HCR_EL2.TSC bit is set and SMC (FF-A) trapping is enabled.

__pkvm_prot_finalize sets the HCR_VM bit, which enables stage-2, and
then writes HCR_EL2 from params->hcr_el2. However, I wasn't sure whether
this is seen as a 'hack' that is not expected to work.
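
For reference, the value quoted above can be decoded with a small
userspace sketch. The bit positions (VM = 0, TSC = 19, RW = 31, E2H = 34)
are the architectural HCR_EL2 fields; the MODEL_* macro names and the
helper functions are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural HCR_EL2 bit positions; macro names are illustrative. */
#define MODEL_HCR_VM  (UINT64_C(1) << 0)   /* stage-2 translation enable */
#define MODEL_HCR_TSC (UINT64_C(1) << 19)  /* trap SMC instructions to EL2 */
#define MODEL_HCR_RW  (UINT64_C(1) << 31)  /* lower EL is AArch64 */
#define MODEL_HCR_E2H (UINT64_C(1) << 34)  /* EL2 host enable */

/* Generic single-bit test with a range check on the bit index. */
static bool hcr_bit_set(uint64_t hcr, unsigned int bit)
{
	assert(bit < 64);
	return (hcr >> bit) & 1;
}

/* True if SMCs executed at EL1 are trapped to EL2. */
static bool hcr_traps_smc(uint64_t hcr)
{
	return (hcr & MODEL_HCR_TSC) != 0;
}

/* True if stage-2 translation is enabled. */
static bool hcr_stage2_enabled(uint64_t hcr)
{
	return (hcr & MODEL_HCR_VM) != 0;
}
```

With 0x480080000 as input, hcr_traps_smc() is true and
hcr_stage2_enabled() is false, which matches the description above: SMC
trapping is on, while stage-2 (HCR_VM) is not yet enabled.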

> if ffa_init() is called before finalised_pkvm(),
> it couldn't proxy the FFA_VERSION, FFA_RXTX_MAP and FFA_PARTITION_INFO_GET
> called by ffa_init().
> 
> How can you guarantee that hyp_ffa_init(), which is called by kvm_arm_init(),
> comes first even when kvm_arm_init() and ffa_init() are both device_initcalls?
> 

While they are both at device_initcall level, the only difference is that
kvm_arm_init is arch code, which is linked before the drivers/ code.
That's why Marc said it is not a solid construct to rely on.


Thanks,
Sebastian 

> [...]
> 
> Thanks
> 
> 
> --
> Sincerely,
> Yeoreum Yun

* [PATCH AUTOSEL 7.0-6.18] ima: Define and use a digest_size field in the ima_algo_desc structure
From: Sasha Levin @ 2026-04-20 13:18 UTC
  To: patches, stable
  Cc: Roberto Sassu, Mimi Zohar, Sasha Levin, dmitry.kasatkin, paul,
	jmorris, serge, linux-integrity, linux-security-module,
	linux-kernel
In-Reply-To: <20260420132314.1023554-1-sashal@kernel.org>

From: Roberto Sassu <roberto.sassu@huawei.com>

[ Upstream commit a74d7197ebe5b1b8028911d47e78c119d9aaf193 ]

Add the digest_size field to the ima_algo_desc structure to determine the
digest size from the correct source.

If the hash algorithm is among allocated PCR banks, take the value from the
TPM bank info (equal to the value from the crypto subsystem if the TPM
algorithm is supported by it; otherwise, not exceeding the size of the
digest buffer in the tpm_digest structure, used by IMA).

If the hash algorithm is SHA1, use the predefined value. Lastly, if the
hash algorithm is the default one but not among the PCR banks, take the
digest size from the crypto subsystem (the default hash algorithm is
checked when parsing the ima_hash= command line option).

Finally, use the new information to correctly show the template digest in
ima_measurements_show() and ima_ascii_measurements_show().

Link: https://github.com/linux-integrity/linux/issues/14
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have complete information. Let me provide the final analysis.

## Phase 1: Commit Message Forensics

### Step 1.1: Subject Line
**Record:** Subsystem: `ima:` (Integrity Measurement Architecture, a
security subsystem). Action verb: "Define and use" - sounds like
enhancement, but functionally equivalent to "fix digest size source".
One-line summary: Add a digest_size field to ima_algo_desc for accurate
template digest output.

### Step 1.2: Tags
**Record:**
- `Link: https://github.com/linux-integrity/linux/issues/14` (bug
  tracker for the linux-integrity subsystem)
- `Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>` (author)
- `Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>` (IMA maintainer -
  applied the patch)
- No `Fixes:`, no `Cc: stable`, no `Reported-by:` tag.

### Step 1.3: Commit Body Analysis
**Record:** The message describes three cases for digest size:
1. TPM bank-allocated algos: take from TPM bank info (which may differ
   from crypto subsystem size)
2. SHA1: use predefined value
3. Default hash algo not among banks: use crypto subsystem's size

Author's framing is additive/improvement ("Add the ... field"), but the
Link points to GitHub issue #14 titled "Out of bound when creating per-
algo measurement list interfaces" - describing a KASAN out-of-bounds
read when TPM has unsupported algorithms (e.g., SHA3_256).

### Step 1.4: Hidden Bug Fix Detection
**Record:** This IS a hidden bug fix. The old code used
`hash_digest_size[algo]` where `algo` can be `HASH_ALGO__LAST` (for
unsupported TPM algos). Since `hash_digest_size` is declared
`[HASH_ALGO__LAST]`, that access is out-of-bounds. The new code uses the
TPM bank's `digest_size` (always valid) or a known constant.

## Phase 2: Diff Analysis

### Step 2.1: Inventory
**Record:** 3 files changed:
- `security/integrity/ima/ima.h` (+1)
- `security/integrity/ima/ima_crypto.c` (+6)
- `security/integrity/ima/ima_fs.c` (+6/-12)

Total: 13 insertions, 12 deletions. Scope: single-subsystem surgical
change.

### Step 2.2-2.3: Code Flow and Bug Mechanism
**Record:** Bug category: **Out-of-bounds read** (KASAN-detectable).

Before fix: `ima_putc(m, e->digests[algo_idx].digest,
hash_digest_size[algo])` where `algo = ima_algo_array[algo_idx].algo`.
If the TPM has an algorithm not supported by the kernel's crypto
subsystem (e.g., SHA3_256 which was not yet in `tpm2_hash_map`), `algo
== HASH_ALGO__LAST`, and `hash_digest_size[HASH_ALGO__LAST]` is an OOB
read of the `[HASH_ALGO__LAST]`-sized array.

After fix: `ima_putc(m, e->digests[algo_idx].digest,
ima_algo_array[algo_idx].digest_size)`. `digest_size` is populated from
`tpm_bank_info.digest_size` (which is filled via `tpm2_pcr_read` for
unknown algos, or `hash_digest_size[crypto_algo]` for known ones),
`SHA1_DIGEST_SIZE`, or `hash_digest_size[ima_hash_algo]` - all safe
indexes.
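
The pattern described above can be sketched in plain userspace C. This
is a simplified model under stated assumptions, not the kernel code:
`enum model_algo`, `model_digest_size[]` and the `model_*` helpers are
hypothetical stand-ins for `hash_algo`, `hash_digest_size[]` and the
`ima_algo_desc` handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's hash_algo enum; LAST is a sentinel. */
enum model_algo { MODEL_SHA1, MODEL_SHA256, MODEL_ALGO_LAST };

/* Sized by the sentinel, so MODEL_ALGO_LAST itself is not a valid index. */
static const size_t model_digest_size[MODEL_ALGO_LAST] = { 20, 32 };

struct model_algo_desc {
	enum model_algo algo;   /* MODEL_ALGO_LAST if the TPM algo is unknown */
	size_t digest_size;     /* recorded once at init time */
};

/* Init time: take the size reported for the bank, not a table lookup. */
static struct model_algo_desc model_desc_from_bank(enum model_algo algo,
						    size_t bank_digest_size)
{
	struct model_algo_desc d = { algo, bank_digest_size };

	assert(algo <= MODEL_ALGO_LAST);
	return d;
}

/* Read time: use the stored size, never index with a possible sentinel. */
static size_t model_output_len(const struct model_algo_desc *d)
{
	return d->digest_size;
}

/* The old table lookup, made safe here only by an explicit bounds check. */
static size_t model_table_lookup(enum model_algo algo)
{
	return algo < MODEL_ALGO_LAST ? model_digest_size[algo] : 0;
}
```

In this model an unknown algorithm carries the sentinel as its `algo`
value but still has a usable `digest_size`, so the output path never
needs to index the table with the sentinel.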

### Step 2.4: Fix Quality
**Record:** Fix is obviously correct, minimal, and well-contained. The
new `digest_size` field is populated once during init (`__init`), then
only read later. Regression risk is low - the change is semantically
equivalent to the old code when the TPM algo is supported, and correct
when it isn't.

## Phase 3: Git History

### Step 3.1-3.2: Blame and Fixes target
**Record:** The buggy line `ima_putc(m, e->digests[algo_idx].digest,
hash_digest_size[algo])` was introduced by commit `9fa8e76250082a`
("ima: add crypto agility support for template-hash algorithm", by
Enrico Bravi, merged in v6.10). This code is present in every stable
tree from v6.10 onwards (so 6.12.y and newer).

### Step 3.3: Related Commits
**Record:** Companion commit `d7bd8cf0b348d` ("ima_fs: Correctly create
securityfs files for unsupported hash algos") was applied 12 days after
this one, sharing the same `Link:` to issue #14. That commit has an
explicit `Fixes: 9fa8e7625008` tag and includes a KASAN dump showing
`create_securityfs_measurement_lists+0x396/0x440` OOB in
`hash_algo_name`. The two commits address two sides of the same bug:
`a74d7197ebe5b` fixes OOB in `hash_digest_size[algo]` (runtime, at file
read), `d7bd8cf0b348d` fixes OOB in `hash_algo_name[algo]` (boot, at
file creation).

### Step 3.4: Author Context
**Record:** Roberto Sassu is a long-term IMA contributor. Mimi Zohar is
the IMA subsystem maintainer who merged the patch.

### Step 3.5: Dependencies
**Record:** The fix depends on `tpm_bank_info.digest_size` being
available, which has existed since commit `879b589210a9a` (2019). No new
dependencies. Applies to any stable tree containing `9fa8e76250082a`
(v6.10+).

## Phase 4: Mailing List Research

### Step 4.1-4.4: Patch Discussion
**Record:**
- `b4 dig -c a74d7197ebe5b` found a single v1 submission at
  `https://lore.kernel.org/all/20260225125301.87996-1-roberto.sassu@huaweicloud.com/`
- Discussion thread contains 3 messages from Mimi Zohar (maintainer) and
  Roberto Sassu. Mimi requested title rename and asked for a note about
  the design change (from crypto subsystem's digest size to TPM's).
- No explicit stable nomination, no mention of KASAN in discussion
  thread itself.
- GitHub issue #14 (referenced via Link: tag) explicitly documents the
  OOB bug this is fixing: "If a TPM algorithm is not supported the PCR
  bank info is initialized with HASH_ALGO__LAST, which passed to
  hash_algo_name[] causes an out of bound."
- No v2, applied as single revision.

### Step 4.5: Stable Discussion
**Record:** No prior stable mailing list discussion found for this
specific commit.

## Phase 5: Code Semantic Analysis

### Step 5.1-5.4: Call Paths
**Record:** `ima_measurements_show()` is called when a userspace process
reads `/sys/kernel/security/ima/binary_runtime_measurements*`.
`ima_ascii_measurements_show()` similarly for ASCII files. These files
are readable by root. The path is reachable from userspace via a simple
`read()` syscall against the securityfs files. `ima_init_crypto()` is
called once at boot via initcall.

### Step 5.5: Similar Patterns
**Record:** The sister commit `d7bd8cf0b348d` addresses the same pattern
(`hash_algo_name[algo]` with `algo == HASH_ALGO__LAST`) in the file-
creation path.

## Phase 6: Stable Tree Cross-Reference

### Step 6.1-6.3: Applicability
**Record:**
- Buggy code exists in 6.12.y (verified via
  `git blame stable-push/linux-6.12.y` showing line 184 originated from
  9fa8e76250082a). Also in 6.15, 6.17, 6.18, 6.19, 7.0.
- 6.1.y and 6.6.y don't have the crypto agility code
  (`hash_digest_size[algo]` usage) - the fix is NOT applicable/needed
  there. 6.6.y uses `TPM_DIGEST_SIZE`.
- Backport difficulty to 6.12.y: at most minor rework. The
  `ima_algo_array` allocation there uses `kcalloc`, whereas the newer
  tree uses `kzalloc_objs`, but this patch does not touch that line, so
  the field addition and assignments apply straightforwardly.
- Neither this commit nor `d7bd8cf0b348d` is yet in 6.12.y (verified via
  `git log stable-push/linux-6.12.y`).

## Phase 7: Subsystem Context

### Step 7.1-7.2
**Record:** Subsystem: IMA (security/integrity/ima/). Criticality:
IMPORTANT - used for measured boot/attestation on enterprise/embedded
systems. Activity: active subsystem with regular fixes. The code is only
reachable when CONFIG_IMA is enabled AND a TPM is present, further
narrowing impact to TPM-equipped systems.

## Phase 8: Impact and Risk

### Step 8.1: Affected Users
**Record:** Users with IMA enabled + TPM 2.0 chip that exposes an
algorithm not in the kernel's `tpm2_hash_map`. The KASAN dump in
d7bd8cf0b348d shows this was hit on real hardware (SHA3_256-capable
TPM).

### Step 8.2: Trigger
**Record:** The secondary OOB fixed by THIS commit
(hash_digest_size[HASH_ALGO__LAST]) triggers when:
1. A TPM exposes an unsupported algorithm (e.g., SHA3_256)
2. A user (root) reads the unsupported-algo measurements file

Root privilege required - not a remote attack vector, but reproducible
with specific hardware. The primary OOB (in create_securityfs) hits
every boot with such TPMs, which is what the KASAN report showed.

### Step 8.3: Failure Mode
**Record:** Out-of-bounds read from kernel memory. Under KASAN: reported
as BUG. Without KASAN: may return garbage digest size, which could cause
excessive data to be read from `e->digests[algo_idx].digest` (a fixed-
size `[TPM2_MAX_DIGEST_SIZE]` buffer) or leak a few bytes past the
`hash_digest_size` array. Severity: **MEDIUM-HIGH** (OOB read is KASAN-
reportable security-relevant behavior, not a guaranteed crash without
KASAN but can leak info or cause incorrect behavior).

### Step 8.4: Risk vs Benefit
**Record:**
- **Benefit: MEDIUM** - Fixes one half of a KASAN-reportable OOB read
  with real-hardware reproducer.
- **Risk: LOW** - 13-line structural change, all within the IMA init
  path + two show functions, no change of external behavior for
  supported TPM algos.
- **Ratio: Favorable for backport** - but only valuable when paired with
  d7bd8cf0b348d (the boot-time crash fix).

## Phase 9: Final Synthesis

### Step 9.1-9.4: Evidence Summary

**For backport:**
- Small, contained (13/12 lines, 3 files)
- Fixes real OOB read (hash_digest_size[HASH_ALGO__LAST])
- Obviously correct - reviewed by IMA maintainer (Mimi Zohar)
- Low regression risk
- Reachable from userspace (root reads securityfs file)
- Bug has real-hardware reproducer (SHA3_256 TPMs)
- Companion commit d7bd8cf0b348d has `Fixes:` tag and will be auto-
  selected; backporting only d7bd8cf0b348d leaves a latent OOB in the
  read path

**Against backport:**
- No explicit `Fixes:` tag, no `Cc: stable`
- Framed as enhancement, not bug fix
- Alone doesn't fix the primary crash (boot-time OOB in
  `create_securityfs_measurement_lists`) - that's d7bd8cf0b348d
- Design change (TPM's size vs crypto's size) noted by maintainer in
  review

**Exception Category:** Not a device ID/quirk/DT/build/doc. Standard bug
fix evaluation.

**Stable Rules Check:**
1. Obviously correct: YES (reviewed, simple struct field addition + safe
   sources)
2. Fixes real bug: YES (OOB read)
3. Important issue: MEDIUM (KASAN-reportable OOB with real hardware)
4. Small and contained: YES (13/12 lines)
5. No new features/APIs: YES (internal struct field, not user-visible)
6. Applies to stable: YES (6.12.y+ with minor/no conflicts)

## Verification

- [Phase 1] Parsed tags: no Fixes:, no Cc: stable, has `Link:
  github.com/linux-integrity/linux/issues/14`
- [Phase 1] Hidden bug fix confirmed: commit uses additive language but
  eliminates OOB read
- [Phase 2] Diff analysis: `git show a74d7197ebe5b` confirmed +13/-12
  across 3 files; key change is replacing `hash_digest_size[algo]` with
  `ima_algo_array[algo_idx].digest_size`
- [Phase 3] `git blame stable-push/linux-6.12.y
  security/integrity/ima/ima_fs.c`: buggy line 184 was introduced by
  9fa8e76250082a (v6.10)
- [Phase 3] `git show 9fa8e76250082`: confirmed merged in v6.10
  (2024-04-12)
- [Phase 3] Found companion commit d7bd8cf0b348d sharing the same GitHub
  issue link
- [Phase 3] Verified `include/linux/tpm.h` defines
  `tpm_bank_info.digest_size` (dependency satisfied in all target stable
  trees since 2019 via commit 879b589210a9a)
- [Phase 4] `b4 dig -c a74d7197ebe5b`: found single v1 submission
- [Phase 4] `b4 dig -c a74d7197ebe5b -a`: only v1, no earlier revisions
- [Phase 4] `b4 dig -c a74d7197ebe5b -w`: maintainer Mimi Zohar was a
  recipient
- [Phase 4] Read `/tmp/thread.mbx`: confirmed maintainer review, no
  stable nomination, no KASAN mention in thread
- [Phase 4] WebFetch github issue #14: confirmed OOB bug rationale
- [Phase 4] WebFetch marc.info original patch: confirmed patch content
  matches commit
- [Phase 5] Verified `ima_measurements_show` is reachable via read() on
  `/sys/kernel/security/ima/*_runtime_measurements*`
- [Phase 6] `git show stable-push/linux-6.12.y:.../ima_fs.c`: confirmed
  buggy line present at 184
- [Phase 6] `git show stable-push/linux-6.6.y:.../ima_fs.c`: uses
  `TPM_DIGEST_SIZE`, no crypto agility, bug doesn't exist there
- [Phase 6] `git show stable-push/linux-6.1.y`: ima_fs.c does not have
  the buggy code
- [Phase 6] `git log stable-push/linux-6.12.y`: neither this commit nor
  d7bd8cf0b348d is yet in 6.12.y
- [Phase 8] Failure mode: OOB read of
  `hash_digest_size[HASH_ALGO__LAST]` - triggered by reading per-algo
  measurement file on TPM with unsupported algo
- UNVERIFIED: Whether d7bd8cf0b348d has been auto-selected by Fixes:-tag
  tooling; whether user-visible symptoms (without KASAN) would be
  observable in the field.

## Conclusion

This commit is a small, surgical fix for a real out-of-bounds read bug
that affects stable trees from v6.12 onwards. While the commit message
frames it as an "add a field" improvement rather than a bug fix, the
code change explicitly eliminates an OOB read in the user-reachable path
`ima_measurements_show()` / `ima_ascii_measurements_show()` when a TPM
exposes an algorithm not supported by the kernel crypto subsystem. The
companion commit `d7bd8cf0b348d` addresses the parallel boot-time OOB
(in `create_securityfs_measurement_lists`) and has a `Fixes:` tag, so it
will likely be auto-selected. If d7bd8cf0b348d reaches stable (as it
should), this commit is needed to plug the remaining runtime OOB on the
same hardware.

**YES**

 security/integrity/ima/ima.h        |  1 +
 security/integrity/ima/ima_crypto.c |  6 ++++++
 security/integrity/ima/ima_fs.c     | 18 ++++++------------
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 89ebe98ffc5e5..c38a9eb945b68 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -53,6 +53,7 @@ extern atomic_t ima_setxattr_allowed_hash_algorithms;
 struct ima_algo_desc {
 	struct crypto_shash *tfm;
 	enum hash_algo algo;
+	unsigned int digest_size;
 };

 /* set during initialization */
diff --git a/security/integrity/ima/ima_crypto.c b/security/integrity/ima/ima_crypto.c
index aff61643415de..10022b0db4d58 100644
--- a/security/integrity/ima/ima_crypto.c
+++ b/security/integrity/ima/ima_crypto.c
@@ -109,6 +109,7 @@ static struct crypto_shash *ima_alloc_tfm(enum hash_algo algo)

 int __init ima_init_crypto(void)
 {
+	unsigned int digest_size;
 	enum hash_algo algo;
 	long rc;
 	int i;
@@ -147,7 +148,9 @@ int __init ima_init_crypto(void)

 	for (i = 0; i < NR_BANKS(ima_tpm_chip); i++) {
 		algo = ima_tpm_chip->allocated_banks[i].crypto_id;
+		digest_size = ima_tpm_chip->allocated_banks[i].digest_size;
 		ima_algo_array[i].algo = algo;
+		ima_algo_array[i].digest_size = digest_size;

 		/* unknown TPM algorithm */
 		if (algo == HASH_ALGO__LAST)
@@ -183,12 +186,15 @@ int __init ima_init_crypto(void)
 		}

 		ima_algo_array[ima_sha1_idx].algo = HASH_ALGO_SHA1;
+		ima_algo_array[ima_sha1_idx].digest_size = SHA1_DIGEST_SIZE;
 	}

 	if (ima_hash_algo_idx >= NR_BANKS(ima_tpm_chip) &&
 	    ima_hash_algo_idx != ima_sha1_idx) {
+		digest_size = hash_digest_size[ima_hash_algo];
 		ima_algo_array[ima_hash_algo_idx].tfm = ima_shash_tfm;
 		ima_algo_array[ima_hash_algo_idx].algo = ima_hash_algo;
+		ima_algo_array[ima_hash_algo_idx].digest_size = digest_size;
 	}

 	return 0;
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 012a58959ff02..23d3a14b8ce36 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -132,16 +132,12 @@ int ima_measurements_show(struct seq_file *m, void *v)
 	char *template_name;
 	u32 pcr, namelen, template_data_len; /* temporary fields */
 	bool is_ima_template = false;
-	enum hash_algo algo;
 	int i, algo_idx;

 	algo_idx = ima_sha1_idx;
-	algo = HASH_ALGO_SHA1;

-	if (m->file != NULL) {
+	if (m->file != NULL)
 		algo_idx = (unsigned long)file_inode(m->file)->i_private;
-		algo = ima_algo_array[algo_idx].algo;
-	}

 	/* get entry */
 	e = qe->entry;
@@ -160,7 +156,8 @@ int ima_measurements_show(struct seq_file *m, void *v)
 	ima_putc(m, &pcr, sizeof(e->pcr));

 	/* 2nd: template digest */
-	ima_putc(m, e->digests[algo_idx].digest, hash_digest_size[algo]);
+	ima_putc(m, e->digests[algo_idx].digest,
+		 ima_algo_array[algo_idx].digest_size);

 	/* 3rd: template name size */
 	namelen = !ima_canonical_fmt ? strlen(template_name) :
@@ -229,16 +226,12 @@ static int ima_ascii_measurements_show(struct seq_file *m, void *v)
 	struct ima_queue_entry *qe = v;
 	struct ima_template_entry *e;
 	char *template_name;
-	enum hash_algo algo;
 	int i, algo_idx;

 	algo_idx = ima_sha1_idx;
-	algo = HASH_ALGO_SHA1;

-	if (m->file != NULL) {
+	if (m->file != NULL)
 		algo_idx = (unsigned long)file_inode(m->file)->i_private;
-		algo = ima_algo_array[algo_idx].algo;
-	}

 	/* get entry */
 	e = qe->entry;
@@ -252,7 +245,8 @@ static int ima_ascii_measurements_show(struct seq_file *m, void *v)
 	seq_printf(m, "%2d ", e->pcr);

 	/* 2nd: template hash */
-	ima_print_digest(m, e->digests[algo_idx].digest, hash_digest_size[algo]);
+	ima_print_digest(m, e->digests[algo_idx].digest,
+			 ima_algo_array[algo_idx].digest_size);

 	/* 3th:  template name */
 	seq_printf(m, " %s", template_name);
-- 
2.53.0


* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Yeoreum Yun @ 2026-04-20 13:00 UTC (permalink / raw)
  To: Sebastian Ene
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe, jarkko,
	jgg, sudeep.holla, maz, oupton, joey.gouly, suzuki.poulose,
	yuzenghui, catalin.marinas, will
In-Reply-To: <aeYbdmshGZJ4GhXd@google.com>


Hi Sebastian,
> On Fri, Apr 17, 2026 at 06:57:59PM +0100, Yeoreum Yun wrote:
>
> Hello Yeoreum,
>
>
> > When pKVM is enabled, the FF-A driver must be initialized after pKVM.
> > Otherwise, pKVM cannot negotiate the FF-A version or
> > obtain RX/TX buffer information, leading to failures in FF-A calls.
>
> At the moment this already happens after you move back ffa_init() to
> device_initcall().

How? kvm_arm_init() is also a device_initcall() when both are built-in.

>
> >
> > During FF-A driver initialization, check whether pKVM has been initialized.
> > If not, defer probing of the FF-A driver.
> >
>
> I don't think you need to add this dependency. pKVM is
> installed through KVM's module_init() which ends up calling hyp_ffa_init() to
> do the proxy initialization. The ARM-FFA driver comes after it (since
> pKVM is arch specific code). We don't have to call finalize_pkvm(..) to
> be able to handle smc(FF-A) calls in the hyp-proxy.
>

As Marc said, before finalize_pkvm() has run, SMCs are not trapped by
pKVM. In other words, when both are built-in and ffa_init() is called
before finalize_pkvm(), pKVM cannot proxy the FFA_VERSION, FFA_RXTX_MAP
and FFA_PARTITION_INFO_GET calls made by ffa_init().

How can you guarantee that hyp_ffa_init(), which is called by
kvm_arm_init(), comes first when both kvm_arm_init() and ffa_init() are
device_initcalls?

[...]

Thanks


--
Sincerely,
Yeoreum Yun


* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Marc Zyngier @ 2026-04-20 12:46 UTC (permalink / raw)
  To: Sebastian Ene
  Cc: Yeoreum Yun, linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe, jarkko,
	jgg, sudeep.holla, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will
In-Reply-To: <aeYbdmshGZJ4GhXd@google.com>

On Mon, 20 Apr 2026 13:32:32 +0100,
Sebastian Ene <sebastianene@google.com> wrote:
> 
> On Fri, Apr 17, 2026 at 06:57:59PM +0100, Yeoreum Yun wrote:
> 
> Hello Yeoreum,
> 
> 
> > When pKVM is enabled, the FF-A driver must be initialized after pKVM.
> > Otherwise, pKVM cannot negotiate the FF-A version or
> > obtain RX/TX buffer information, leading to failures in FF-A calls.
> 
> At the moment this already happens after you move back ffa_init() to
> device_initcall().

But relying on this sort of ordering is just making things more
fragile.

> 
> > 
> > During FF-A driver initialization, check whether pKVM has been initialized.
> > If not, defer probing of the FF-A driver.
> > 
> 
> I don't think you need to add this dependency. pKVM is
> installed through KVM's module_init() which ends up calling hyp_ffa_init() to
> do the proxy initialization. The ARM-FFA driver comes after it (since
> pKVM is arch specific code). We don't have to call finalize_pkvm(..) to
> be able to handle smc(FF-A) calls in the hyp-proxy.

You do. Without the finalisation, SMCs are not trapped by EL2.

And even if it did, relying on such hack is just wrong.

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Sebastian Ene @ 2026-04-20 12:32 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe, jarkko,
	jgg, sudeep.holla, maz, oupton, joey.gouly, suzuki.poulose,
	yuzenghui, catalin.marinas, will
In-Reply-To: <20260417175759.3191279-5-yeoreum.yun@arm.com>

On Fri, Apr 17, 2026 at 06:57:59PM +0100, Yeoreum Yun wrote:

Hello Yeoreum,


> When pKVM is enabled, the FF-A driver must be initialized after pKVM.
> Otherwise, pKVM cannot negotiate the FF-A version or
> obtain RX/TX buffer information, leading to failures in FF-A calls.

At the moment this already happens after you move back ffa_init() to
device_initcall().

> 
> During FF-A driver initialization, check whether pKVM has been initialized.
> If not, defer probing of the FF-A driver.
> 

I don't think you need to add this dependency. pKVM is
installed through KVM's module_init() which ends up calling hyp_ffa_init() to
do the proxy initialization. The ARM-FFA driver comes after it (since
pKVM is arch specific code). We don't have to call finalize_pkvm(..) to
be able to handle smc(FF-A) calls in the hyp-proxy.

> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> ---
>  arch/arm64/kvm/arm.c              |  1 +
>  drivers/firmware/arm_ffa/driver.c | 12 ++++++++++++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 410ffd41fd73..0f517b1c05cd 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -119,6 +119,7 @@ bool is_kvm_arm_initialised(void)
>  {
>  	return kvm_arm_initialised;
>  }
> +EXPORT_SYMBOL(is_kvm_arm_initialised);
> 
>  int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
>  {
> diff --git a/drivers/firmware/arm_ffa/driver.c b/drivers/firmware/arm_ffa/driver.c
> index 02c76ac1570b..2647d6554afd 100644
> --- a/drivers/firmware/arm_ffa/driver.c
> +++ b/drivers/firmware/arm_ffa/driver.c
> @@ -42,6 +42,8 @@
>  #include <linux/uuid.h>
>  #include <linux/xarray.h>
> 
> +#include <asm/virt.h>
> +
>  #include "common.h"
> 
>  #define FFA_DRIVER_VERSION	FFA_VERSION_1_2
> @@ -2035,6 +2037,16 @@ static int __init ffa_init(void)
>  	u32 buf_sz;
>  	size_t rxtx_bufsz = SZ_4K;
> 
> +	/*
> +	 * When pKVM is enabled, the FF-A driver must be initialized
> +	 * after pKVM initialization. Otherwise, pKVM cannot negotiate
> +	 * the FF-A version or obtain RX/TX buffer information,
> +	 * which leads to failures in FF-A calls.
> +	 */
> +	if (IS_ENABLED(CONFIG_KVM) && is_protected_kvm_enabled() &&
> +	    !is_kvm_arm_initialised())
> +		return -EPROBE_DEFER;
> +
>  	ret = ffa_transport_init(&invoke_ffa_fn);
>  	if (ret)
>  		return ret;
> --
> LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
> 

Thanks,
Sebastian


* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Yeoreum Yun @ 2026-04-20 10:56 UTC (permalink / raw)
  To: Will Deacon
  Cc: Marc Zyngier, linux-security-module, linux-kernel,
	linux-integrity, linux-arm-kernel, kvmarm, paul, jmorris, serge,
	zohar, roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe,
	jarkko, jgg, sudeep.holla, oupton, joey.gouly, suzuki.poulose,
	yuzenghui, catalin.marinas, sebastianene
In-Reply-To: <aeYDMEgWdt8F9jWb@willie-the-truck>

Hi Will,

> [+Seb for the pKVM FFA bits]
>
> On Mon, Apr 20, 2026 at 10:25:29AM +0100, Yeoreum Yun wrote:
> > > On Sun, Apr 19, 2026 at 12:12:44PM +0100, Yeoreum Yun wrote:
> > > > > On Sat, 18 Apr 2026 11:34:30 +0100,
> > > > > Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > > > > >
> > > > > > > > @@ -2035,6 +2037,16 @@ static int __init ffa_init(void)
> > > > > > > >  	u32 buf_sz;
> > > > > > > >  	size_t rxtx_bufsz = SZ_4K;
> > > > > > > >
> > > > > > > > +	/*
> > > > > > > > +	 * When pKVM is enabled, the FF-A driver must be initialized
> > > > > > > > +	 * after pKVM initialization. Otherwise, pKVM cannot negotiate
> > > > > > > > +	 * the FF-A version or obtain RX/TX buffer information,
> > > > > > > > +	 * which leads to failures in FF-A calls.
> > > > > > > > +	 */
> > > > > > > > +	if (IS_ENABLED(CONFIG_KVM) && is_protected_kvm_enabled() &&
> > > > > > > > +	    !is_kvm_arm_initialised())
> > > > > > > > +		return -EPROBE_DEFER;
> > > > > > > > +
> > > > > > >
> > > > > > > That's still fundamentally wrong: pkvm is not ready until
> > > > > > > finalize_pkvm() has finished, and that's not indicated by
> > > > > > > is_kvm_arm_initialised().
> > > > > >
> > > > > > Thanks. I miss the TSC bit set in here.
> > > > >
> > > > > That's the least of the problems. None of the infrastructure is in
> > > > > place at this stage...
> > > > >
> > > > > > IMHO, I'd like to make an new state check function --
> > > > > > is_pkvm_arm_initialised() so that ff-a driver to know whether
> > > > > > pkvm is initialised.
> > > > >
> > > > > Doesn't sound great, TBH.
> > > > >
> > > > > > or any other suggestion?
> > > > >
> > > > > Instead of adding more esoteric predicates, I'd rather you build on an
> > > > > existing infrastructure. You have a dependency on KVM, use something
> > > > > that is designed to enforce dependencies. Device links spring to mind
> > > > > as something designed for that.
> > > > >
> > > > > Can you look into enabling this for KVM? If that's possible, then it
> > > > > should be easy enough to delay the actual KVM registration after pKVM
> > > > > is finalised.
> > > >
> > > > or what about some event notifier? Just like:
> > >
> > > This seems a bit over-engineered to me. Why don't you just split the
> > > FF-A initialisation into two steps: an early part which does the version
> > > negotiation and then a later part which can fit in with whatever
> > > dependencies you have on the TPM?
> >
> > Sorry, I may have misunderstood your suggestion and
> > I might be in missing your point.
> >
> > But, The issue here is that FFA_VERSION, FFA_RXTX_MAP, and
> > FFA_PARTITION_INFO_GET, which are invoked from ffa_init()
> > as part of early initialisation, must be trapped by pKVM.
> >
> > In other words, even the early part of the initialization,
> > including version negotiation, needs to happen after pKVM
> > is initialized.
> >
> > Because of this dependency, simply splitting the FF-A
> > initialization into two phases within the driver does not
> > seem sufficient, as it still requires knowing when pKVM
> > has been initialized.
> >
> > Am I missing something?
>
> Ah sorry, I mixed up the ordering of 'module_init' vs 'rootfs_initcall'
> and thought you wanted to probe the version earlier. But then I'm still
> confused because, prior to 0e0546eabcd6 ("firmware: arm_ffa: Change
> initcall level of ffa_init() to rootfs_initcall"), ffa_init() was a
> 'device_initcall' which is still called earlier than finalize_pkvm().

Right, and this is what I missed when writing patch
0e0546eabcd6 ("firmware: arm_ffa: Change initcall level of ffa_init() to rootfs_initcall").
The problem still exists even when ffa_init() is a device_initcall.

However, rather than changing ffa_init to rootfs_initcall, moving ima_init
to late_initcall_sync is a better approach, as it also addresses similar
issues for TPM devices that do not use FF-A. For this reason,
the FF-A-related changes were reverted.

As a result, patch 4/4 addresses an issue that existed independently of
0e0546eabcd6, as you pointed out.
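
The ordering argument in this thread can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the level names, `struct initcall`, `order_initcalls()` and `position_of()` are hypothetical, and the model assumes that initcalls run level by level and that calls at the same level run in link order, represented here by their position in the input array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: a subset of initcall levels, in execution order. */
enum level { LVL_DEVICE, LVL_LATE, LVL_LATE_SYNC, LVL_COUNT };

struct initcall {
	enum level level;
	const char *name;
};

/*
 * Order calls by level first and, within one level, by their position in
 * the input (standing in for link order). A stable insertion sort keeps
 * same-level entries in their original relative order.
 */
static void order_initcalls(struct initcall *calls, size_t n)
{
	assert(calls != NULL || n == 0);

	for (size_t i = 1; i < n; i++) {
		struct initcall key = calls[i];
		size_t j = i;

		while (j > 0 && calls[j - 1].level > key.level) {
			calls[j] = calls[j - 1];
			j--;
		}
		calls[j] = key;
	}
}

/* Return the position of @name in @calls, or -1 if it is absent. */
static int position_of(const struct initcall *calls, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(calls[i].name, name) == 0)
			return (int)i;
	return -1;
}
```

Feeding it ffa_init() and kvm_arm_init() at device level, deferred_probe_initcall() at late level and init_ima() at late_sync level puts init_ima() last, while the relative order of the two device-level calls depends only on where they appear in the input. That second point is why same-level ordering is fragile to rely on.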

--
Sincerely,
Yeoreum Yun


* Re: [RFC PATCH 1/4] security: ima: move ima_init into late_initcall_sync
From: Jonathan McDowell @ 2026-04-20 10:32 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe, jarkko,
	jgg, sudeep.holla, maz, oupton, joey.gouly, suzuki.poulose,
	yuzenghui, catalin.marinas, will
In-Reply-To: <20260417175759.3191279-2-yeoreum.yun@arm.com>

On Fri, Apr 17, 2026 at 06:57:56PM +0100, Yeoreum Yun wrote:
>To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
>the TPM driver must be built as built-in and
>must be probed before the IMA subsystem is initialized.
>
>However, when the TPM device operates over the FF-A protocol using
>the CRB interface, probing fails and returns -EPROBE_DEFER if
>the tpm_crb_ffa device — an FF-A device that provides the communication
>interface to the tpm_crb driver — has not yet been probed.
>
>To ensure the TPM device operating over the FF-A protocol with
>the CRB interface is probed before IMA initialization,
>the following conditions must be met:
>
>   1. The corresponding ffa_device must be registered,
>      which is done via ffa_init().
>
>   2. The tpm_crb_driver must successfully probe this device via
>      tpm_crb_ffa_init().
>
>   3. The tpm_crb driver using CRB over FF-A can then
>      be probed successfully. (See crb_acpi_add() and
>      tpm_crb_ffa_init() for reference.)
>
>Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
>all registered with device_initcall, which means crb_acpi_driver_init() may
>be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
>
>When this occurs, probing the TPM device is deferred.
>However, the deferred probe can happen after the IMA subsystem
>has already been initialized, since IMA initialization is performed
>during late_initcall, and deferred_probe_initcall() is performed
>at the same level.
>
>To resolve this, move ima_init() into late_inicall_sync level
>so that let IMA not miss TPM PCR value when generating boot_aggregate
>log though TPM device presents in the system.
>
>Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>

Awesome. This fixes the problems I saw with an SPI TPM on an NVIDIA 
GB200 system and reported in 

https://lore.kernel.org/linux-integrity/aYXEepLhUouN5f99@earth.li/

Reviewed-by: Jonathan McDowell <noodles@meta.com>
Tested-by: Jonathan McDowell <noodles@meta.com>

>---
> include/linux/lsm_hooks.h         |  2 ++
> security/integrity/ima/ima_main.c |  2 +-
> security/lsm_init.c               | 13 +++++++++++--
> 3 files changed, 14 insertions(+), 3 deletions(-)
>
>diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
>index d48bf0ad26f4..88fe105b7f00 100644
>--- a/include/linux/lsm_hooks.h
>+++ b/include/linux/lsm_hooks.h
>@@ -166,6 +166,7 @@ enum lsm_order {
>  * @initcall_fs: LSM callback for fs_initcall setup, optional
>  * @initcall_device: LSM callback for device_initcall() setup, optional
>  * @initcall_late: LSM callback for late_initcall() setup, optional
>+ * @initcall_late_sync: LSM callback for late_initcall_sync() setup, optional
>  */
> struct lsm_info {
> 	const struct lsm_id *id;
>@@ -181,6 +182,7 @@ struct lsm_info {
> 	int (*initcall_fs)(void);
> 	int (*initcall_device)(void);
> 	int (*initcall_late)(void);
>+	int (*initcall_late_sync)(void);
> };
>
> #define DEFINE_LSM(lsm)							\
>diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
>index 1d6229b156fb..ace280fa3212 100644
>--- a/security/integrity/ima/ima_main.c
>+++ b/security/integrity/ima/ima_main.c
>@@ -1320,5 +1320,5 @@ DEFINE_LSM(ima) = {
> 	.order = LSM_ORDER_LAST,
> 	.blobs = &ima_blob_sizes,
> 	/* Start IMA after the TPM is available */
>-	.initcall_late = init_ima,
>+	.initcall_late_sync = init_ima,
> };
>diff --git a/security/lsm_init.c b/security/lsm_init.c
>index 573e2a7250c4..4e5c59beb82a 100644
>--- a/security/lsm_init.c
>+++ b/security/lsm_init.c
>@@ -547,13 +547,22 @@ device_initcall(security_initcall_device);
>  * security_initcall_late - Run the LSM late initcalls
>  */
> static int __init security_initcall_late(void)
>+{
>+	return lsm_initcall(late);
>+}
>+late_initcall(security_initcall_late);
>+
>+/**
>+ * security_initcall_late_sync - Run the LSM late initcalls sync
>+ */
>+static int __init security_initcall_late_sync(void)
> {
> 	int rc;
>
>-	rc = lsm_initcall(late);
>+	rc = lsm_initcall(late_sync);
> 	lsm_pr_dbg("all enabled LSMs fully activated\n");
> 	call_blocking_lsm_notifier(LSM_STARTED_ALL, NULL);
>
> 	return rc;
> }
>-late_initcall(security_initcall_late);
>+late_initcall_sync(security_initcall_late_sync);
>--
>LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
>
>

J.

-- 
] https://www.earth.li/~noodles/ []  "Do I scare you?" "No." "Do you   [
]  PGP/GPG Key @ the.earth.li    []   want me to?" -- Wayne's World.   [
] via keyserver, web or email.   []                                    [
] RSA: 4096/0x94FA372B2DA8B985   []                                    [


* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Will Deacon @ 2026-04-20 10:42 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: Marc Zyngier, linux-security-module, linux-kernel,
	linux-integrity, linux-arm-kernel, kvmarm, paul, jmorris, serge,
	zohar, roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe,
	jarkko, jgg, sudeep.holla, oupton, joey.gouly, suzuki.poulose,
	yuzenghui, catalin.marinas, sebastianene
In-Reply-To: <aeXxCe4hdizdQbFD@e129823.arm.com>

[+Seb for the pKVM FFA bits]

On Mon, Apr 20, 2026 at 10:25:29AM +0100, Yeoreum Yun wrote:
> > On Sun, Apr 19, 2026 at 12:12:44PM +0100, Yeoreum Yun wrote:
> > > > On Sat, 18 Apr 2026 11:34:30 +0100,
> > > > Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > > > >
> > > > > > > @@ -2035,6 +2037,16 @@ static int __init ffa_init(void)
> > > > > > >  	u32 buf_sz;
> > > > > > >  	size_t rxtx_bufsz = SZ_4K;
> > > > > > >
> > > > > > > +	/*
> > > > > > > +	 * When pKVM is enabled, the FF-A driver must be initialized
> > > > > > > +	 * after pKVM initialization. Otherwise, pKVM cannot negotiate
> > > > > > > +	 * the FF-A version or obtain RX/TX buffer information,
> > > > > > > +	 * which leads to failures in FF-A calls.
> > > > > > > +	 */
> > > > > > > +	if (IS_ENABLED(CONFIG_KVM) && is_protected_kvm_enabled() &&
> > > > > > > +	    !is_kvm_arm_initialised())
> > > > > > > +		return -EPROBE_DEFER;
> > > > > > > +
> > > > > >
> > > > > > That's still fundamentally wrong: pkvm is not ready until
> > > > > > finalize_pkvm() has finished, and that's not indicated by
> > > > > > is_kvm_arm_initialised().
> > > > >
> > > > > Thanks. I miss the TSC bit set in here.
> > > >
> > > > That's the least of the problems. None of the infrastructure is in
> > > > place at this stage...
> > > >
> > > > > IMHO, I'd like to make an new state check function --
> > > > > is_pkvm_arm_initialised() so that ff-a driver to know whether
> > > > > pkvm is initialised.
> > > >
> > > > Doesn't sound great, TBH.
> > > >
> > > > > or any other suggestion?
> > > >
> > > > Instead of adding more esoteric predicates, I'd rather you build on an
> > > > existing infrastructure. You have a dependency on KVM, use something
> > > > that is designed to enforce dependencies. Device links spring to mind
> > > > as something designed for that.
> > > >
> > > > Can you look into enabling this for KVM? If that's possible, then it
> > > > should be easy enough to delay the actual KVM registration after pKVM
> > > > is finalised.
> > >
> > > or what about some event notifier? Just like:
> >
> > This seems a bit over-engineered to me. Why don't you just split the
> > FF-A initialisation into two steps: an early part which does the version
> > negotiation and then a later part which can fit in with whatever
> > dependencies you have on the TPM?
> 
> Sorry, I may have misunderstood your suggestion and
> I might be in missing your point.
> 
> But, The issue here is that FFA_VERSION, FFA_RXTX_MAP, and
> FFA_PARTITION_INFO_GET, which are invoked from ffa_init()
> as part of early initialisation, must be trapped by pKVM.
> 
> In other words, even the early part of the initialization,
> including version negotiation, needs to happen after pKVM
> is initialized.
> 
> Because of this dependency, simply splitting the FF-A
> initialization into two phases within the driver does not
> seem sufficient, as it still requires knowing when pKVM
> has been initialized.
> 
> Am I missing something?

Ah sorry, I mixed up the ordering of 'module_init' vs 'rootfs_initcall'
and thought you wanted to probe the version earlier. But then I'm still
confused because, prior to 0e0546eabcd6 ("firmware: arm_ffa: Change
initcall level of ffa_init() to rootfs_initcall"), ffa_init() was a
'device_initcall' which is still called earlier than finalize_pkvm().
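
For reference, a minimal userspace sketch of the ordering in question. It
assumes finalize_pkvm() is registered with device_initcall_sync() (assumed
here, consistent with the point above that a device_initcall still runs
before it). The enum and function names are made up for illustration and
model only a subset of the levels in include/linux/init.h.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Subset of the built-in initcall levels, listed in the order the kernel
 * runs them. Names are illustrative, not kernel identifiers.
 */
enum initcall_level {
	LVL_FS,			/* fs_initcall() */
	LVL_ROOTFS,		/* rootfs_initcall() */
	LVL_DEVICE,		/* device_initcall(), built-in module_init() */
	LVL_DEVICE_SYNC,	/* device_initcall_sync() */
	LVL_LATE,		/* late_initcall() */
	LVL_LATE_SYNC,		/* late_initcall_sync() */
};

/* True if everything at level a has run before level b starts. */
bool runs_before(enum initcall_level a, enum initcall_level b)
{
	return a < b;
}

/* Assumption: finalize_pkvm() sits at device_initcall_sync. */
void check_ffa_vs_pkvm_order(void)
{
	const enum initcall_level pkvm_final = LVL_DEVICE_SYNC;

	/* Both levels ffa_init() has used run before pKVM is finalised. */
	assert(runs_before(LVL_ROOTFS, pkvm_final));
	assert(runs_before(LVL_DEVICE, pkvm_final));
	/* Only a later level would be ordered after it. */
	assert(runs_before(pkvm_final, LVL_LATE));
}
```

Under that assumption, neither rootfs_initcall nor device_initcall places
ffa_init() after pKVM finalisation, which is why the initcall level alone
does not settle the dependency.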

Will

^ permalink raw reply

* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Yeoreum Yun @ 2026-04-20  9:25 UTC (permalink / raw)
  To: Will Deacon
  Cc: Marc Zyngier, linux-security-module, linux-kernel,
	linux-integrity, linux-arm-kernel, kvmarm, paul, jmorris, serge,
	zohar, roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe,
	jarkko, jgg, sudeep.holla, oupton, joey.gouly, suzuki.poulose,
	yuzenghui, catalin.marinas
In-Reply-To: <aeXp7WSqpXNytNPG@willie-the-truck>

Hi Will,

> On Sun, Apr 19, 2026 at 12:12:44PM +0100, Yeoreum Yun wrote:
> > Hi Marc,
> >
> > > On Sat, 18 Apr 2026 11:34:30 +0100,
> > > Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > > >
> > > > > > @@ -2035,6 +2037,16 @@ static int __init ffa_init(void)
> > > > > >  	u32 buf_sz;
> > > > > >  	size_t rxtx_bufsz = SZ_4K;
> > > > > >
> > > > > > +	/*
> > > > > > +	 * When pKVM is enabled, the FF-A driver must be initialized
> > > > > > +	 * after pKVM initialization. Otherwise, pKVM cannot negotiate
> > > > > > +	 * the FF-A version or obtain RX/TX buffer information,
> > > > > > +	 * which leads to failures in FF-A calls.
> > > > > > +	 */
> > > > > > +	if (IS_ENABLED(CONFIG_KVM) && is_protected_kvm_enabled() &&
> > > > > > +	    !is_kvm_arm_initialised())
> > > > > > +		return -EPROBE_DEFER;
> > > > > > +
> > > > >
> > > > > That's still fundamentally wrong: pkvm is not ready until
> > > > > finalize_pkvm() has finished, and that's not indicated by
> > > > > is_kvm_arm_initialised().
> > > >
> > > > Thanks. I miss the TSC bit set in here.
> > >
> > > That's the least of the problems. None of the infrastructure is in
> > > place at this stage...
> > >
> > > > IMHO, I'd like to make an new state check function --
> > > > is_pkvm_arm_initialised() so that ff-a driver to know whether
> > > > pkvm is initialised.
> > >
> > > Doesn't sound great, TBH.
> > >
> > > > or any other suggestion?
> > >
> > > Instead of adding more esoteric predicates, I'd rather you build on an
> > > existing infrastructure. You have a dependency on KVM, use something
> > > that is designed to enforce dependencies. Device links spring to mind
> > > as something designed for that.
> > >
> > > Can you look into enabling this for KVM? If that's possible, then it
> > > should be easy enough to delay the actual KVM registration after pKVM
> > > is finalised.
> >
> > or what about some event notifier? Just like:
>
> This seems a bit over-engineered to me. Why don't you just split the
> FF-A initialisation into two steps: an early part which does the version
> negotiation and then a later part which can fit in with whatever
> dependencies you have on the TPM?

Sorry, I may have misunderstood your suggestion and
I might be missing your point.

But the issue here is that FFA_VERSION, FFA_RXTX_MAP, and
FFA_PARTITION_INFO_GET, which are invoked from ffa_init()
as part of early initialisation, must be trapped by pKVM.

In other words, even the early part of the initialization,
including version negotiation, needs to happen after pKVM
is initialized.

Because of this dependency, simply splitting the FF-A
initialization into two phases within the driver does not
seem sufficient, as it still requires knowing when pKVM
has been initialized.

Am I missing something?

--
Sincerely,
Yeoreum Yun

^ permalink raw reply

* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Will Deacon @ 2026-04-20  8:55 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: Marc Zyngier, linux-security-module, linux-kernel,
	linux-integrity, linux-arm-kernel, kvmarm, paul, jmorris, serge,
	zohar, roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe,
	jarkko, jgg, sudeep.holla, oupton, joey.gouly, suzuki.poulose,
	yuzenghui, catalin.marinas
In-Reply-To: <aeS4rAeVQ0yJIPYw@e129823.arm.com>

On Sun, Apr 19, 2026 at 12:12:44PM +0100, Yeoreum Yun wrote:
> Hi Marc,
> 
> > On Sat, 18 Apr 2026 11:34:30 +0100,
> > Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> > >
> > > > > @@ -2035,6 +2037,16 @@ static int __init ffa_init(void)
> > > > >  	u32 buf_sz;
> > > > >  	size_t rxtx_bufsz = SZ_4K;
> > > > >
> > > > > +	/*
> > > > > +	 * When pKVM is enabled, the FF-A driver must be initialized
> > > > > +	 * after pKVM initialization. Otherwise, pKVM cannot negotiate
> > > > > +	 * the FF-A version or obtain RX/TX buffer information,
> > > > > +	 * which leads to failures in FF-A calls.
> > > > > +	 */
> > > > > +	if (IS_ENABLED(CONFIG_KVM) && is_protected_kvm_enabled() &&
> > > > > +	    !is_kvm_arm_initialised())
> > > > > +		return -EPROBE_DEFER;
> > > > > +
> > > >
> > > > That's still fundamentally wrong: pkvm is not ready until
> > > > finalize_pkvm() has finished, and that's not indicated by
> > > > is_kvm_arm_initialised().
> > >
> > > Thanks. I miss the TSC bit set in here.
> >
> > That's the least of the problems. None of the infrastructure is in
> > place at this stage...
> >
> > > IMHO, I'd like to make an new state check function --
> > > is_pkvm_arm_initialised() so that ff-a driver to know whether
> > > pkvm is initialised.
> >
> > Doesn't sound great, TBH.
> >
> > > or any other suggestion?
> >
> > Instead of adding more esoteric predicates, I'd rather you build on an
> > existing infrastructure. You have a dependency on KVM, use something
> > that is designed to enforce dependencies. Device links spring to mind
> > as something designed for that.
> >
> > Can you look into enabling this for KVM? If that's possible, then it
> > should be easy enough to delay the actual KVM registration after pKVM
> > is finalised.
> 
> or what about some event notifier? Just like:

This seems a bit over-engineered to me. Why don't you just split the
FF-A initialisation into two steps: an early part which does the version
negotiation and then a later part which can fit in with whatever
dependencies you have on the TPM?

Will

^ permalink raw reply

* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Yeoreum Yun @ 2026-04-19 11:12 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe, jarkko,
	jgg, sudeep.holla, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will
In-Reply-To: <87pl3vb5bm.wl-maz@kernel.org>

Hi Marc,

> On Sat, 18 Apr 2026 11:34:30 +0100,
> Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> >
> > > > @@ -2035,6 +2037,16 @@ static int __init ffa_init(void)
> > > >  	u32 buf_sz;
> > > >  	size_t rxtx_bufsz = SZ_4K;
> > > >
> > > > +	/*
> > > > +	 * When pKVM is enabled, the FF-A driver must be initialized
> > > > +	 * after pKVM initialization. Otherwise, pKVM cannot negotiate
> > > > +	 * the FF-A version or obtain RX/TX buffer information,
> > > > +	 * which leads to failures in FF-A calls.
> > > > +	 */
> > > > +	if (IS_ENABLED(CONFIG_KVM) && is_protected_kvm_enabled() &&
> > > > +	    !is_kvm_arm_initialised())
> > > > +		return -EPROBE_DEFER;
> > > > +
> > >
> > > That's still fundamentally wrong: pkvm is not ready until
> > > finalize_pkvm() has finished, and that's not indicated by
> > > is_kvm_arm_initialised().
> >
> > Thanks. I miss the TSC bit set in here.
>
> That's the least of the problems. None of the infrastructure is in
> place at this stage...
>
> > IMHO, I'd like to make an new state check function --
> > is_pkvm_arm_initialised() so that ff-a driver to know whether
> > pkvm is initialised.
>
> Doesn't sound great, TBH.
>
> > or any other suggestion?
>
> Instead of adding more esoteric predicates, I'd rather you build on an
> existing infrastructure. You have a dependency on KVM, use something
> that is designed to enforce dependencies. Device links spring to mind
> as something designed for that.
>
> Can you look into enabling this for KVM? If that's possible, then it
> should be easy enough to delay the actual KVM registration after pKVM
> is finalised.

Or what about an event notifier? Something like:

----------&<-----------

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index b51ab6840f9c..ad038a3b8727 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -68,6 +68,8 @@
 #include <asm/sysreg.h>
 #include <asm/cpufeature.h>

+struct notifier_block;
+
 /*
  * __boot_cpu_mode records what mode CPUs were booted in.
  * A correctly-implemented bootloader must start all CPUs in the same mode:
@@ -166,6 +168,15 @@ static inline bool is_hyp_nvhe(void)
 	return is_hyp_mode_available() && !is_kernel_in_hyp_mode();
 }

+enum kvm_arm_event {
+	PKVM_INITIALISED,
+	KVM_ARM_EVENT_MAX,
+};
+
+extern int kvm_arm_event_notifier_call_chain(enum kvm_arm_event event, void *data);
+extern int kvm_arm_event_notifier_register(struct notifier_block *nb);
+extern int kvm_arm_event_notifier_unregister(struct notifier_block *nb);
+
 #endif /* __ASSEMBLER__ */

 #endif /* ! __ASM__VIRT_H */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 410ffd41fd73..8da10049ab65 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -14,6 +14,7 @@
 #include <linux/vmalloc.h>
 #include <linux/fs.h>
 #include <linux/mman.h>
+#include <linux/notifier.h>
 #include <linux/sched.h>
 #include <linux/kvm.h>
 #include <linux/kvm_irqfd.h>
@@ -111,6 +112,8 @@ DECLARE_KVM_NVHE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);

 DECLARE_KVM_NVHE_PER_CPU(struct kvm_cpu_context, kvm_hyp_ctxt);

+BLOCKING_NOTIFIER_HEAD(kvm_arm_event_notifier_head);
+
 static bool vgic_present, kvm_arm_initialised;

 static DEFINE_PER_CPU(unsigned char, kvm_hyp_initialized);
@@ -3064,4 +3067,22 @@ enum kvm_mode kvm_get_mode(void)
 	return kvm_mode;
 }

+int kvm_arm_event_notifier_call_chain(enum kvm_arm_event event, void *data)
+{
+	return blocking_notifier_call_chain(&kvm_arm_event_notifier_head,
+					    event, data);
+}
+
+int kvm_arm_event_notifier_register(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_register(&kvm_arm_event_notifier_head, nb);
+}
+EXPORT_SYMBOL_GPL(kvm_arm_event_notifier_register);
+
+int kvm_arm_event_notifier_unregister(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_unregister(&kvm_arm_event_notifier_head, nb);
+}
+EXPORT_SYMBOL_GPL(kvm_arm_event_notifier_unregister);
+
 module_init(kvm_arm_init);
diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
index d7a0f69a9982..e76562b0a45a 100644
--- a/arch/arm64/kvm/pkvm.c
+++ b/arch/arm64/kvm/pkvm.c
@@ -280,6 +280,8 @@ static int __init finalize_pkvm(void)
 	ret = pkvm_drop_host_privileges();
 	if (ret)
 		pr_err("Failed to finalize Hyp protection: %d\n", ret);
+	else
+		kvm_arm_event_notifier_call_chain(PKVM_INITIALISED, NULL);

 	return ret;
 }
diff --git a/drivers/firmware/arm_ffa/common.h b/drivers/firmware/arm_ffa/common.h
index 9c6425a81d0d..5cdf4bd222c6 100644
--- a/drivers/firmware/arm_ffa/common.h
+++ b/drivers/firmware/arm_ffa/common.h
@@ -18,9 +18,9 @@ bool ffa_device_is_valid(struct ffa_device *ffa_dev);
 void ffa_device_match_uuid(struct ffa_device *ffa_dev, const uuid_t *uuid);

 #ifdef CONFIG_ARM_FFA_SMCCC
-int __init ffa_transport_init(ffa_fn **invoke_ffa_fn);
+int ffa_transport_init(ffa_fn **invoke_ffa_fn);
 #else
-static inline int __init ffa_transport_init(ffa_fn **invoke_ffa_fn)
+static inline int ffa_transport_init(ffa_fn **invoke_ffa_fn)
 {
 	return -EOPNOTSUPP;
 }
diff --git a/drivers/firmware/arm_ffa/driver.c b/drivers/firmware/arm_ffa/driver.c
index 02c76ac1570b..67df053e65b8 100644
--- a/drivers/firmware/arm_ffa/driver.c
+++ b/drivers/firmware/arm_ffa/driver.c
@@ -35,6 +35,7 @@
 #include <linux/module.h>
 #include <linux/mm.h>
 #include <linux/mutex.h>
+#include <linux/notifier.h>
 #include <linux/of_irq.h>
 #include <linux/scatterlist.h>
 #include <linux/slab.h>
@@ -42,6 +43,8 @@
 #include <linux/uuid.h>
 #include <linux/xarray.h>

+#include <asm/virt.h>
+
 #include "common.h"

 #define FFA_DRIVER_VERSION	FFA_VERSION_1_2
@@ -2029,7 +2032,7 @@ static void ffa_notifications_setup(void)
 	ffa_notifications_cleanup();
 }

-static int __init ffa_init(void)
+static int __ffa_init(void)
 {
 	int ret;
 	u32 buf_sz;
@@ -2105,11 +2108,42 @@ static int __init ffa_init(void)
 free_drv_info:
 	kfree(drv_info);
 	return ret;
+
+}
+
+static int ffa_kvm_arm_event_handler(struct notifier_block *nb,
+				     unsigned long event, void *unused)
+{
+	if (event == PKVM_INITIALISED)
+		__ffa_init();
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block ffa_kvm_arm_event_notifier = {
+	.notifier_call = ffa_kvm_arm_event_handler,
+};
+
+static int __init ffa_init(void)
+{
+	/*
+	 * When pKVM is enabled, the FF-A driver must be initialized
+	 * after pKVM initialization. Otherwise, pKVM cannot negotiate
+	 * the FF-A version or obtain RX/TX buffer information,
+	 * which leads to failures in FF-A calls.
+	 */
+	if (IS_ENABLED(CONFIG_KVM) && is_protected_kvm_enabled() &&
+	    !is_pkvm_initialized())
+		return kvm_arm_event_notifier_register(&ffa_kvm_arm_event_notifier);
+
+	return __ffa_init();
 }
 device_initcall(ffa_init);

 static void __exit ffa_exit(void)
 {
+	if (IS_ENABLED(CONFIG_KVM))
+		kvm_arm_event_notifier_unregister(&ffa_kvm_arm_event_notifier);
 	ffa_notifications_cleanup();
 	ffa_partitions_cleanup();
 	ffa_rxtx_unmap();
diff --git a/drivers/firmware/arm_ffa/smccc.c b/drivers/firmware/arm_ffa/smccc.c
index 4d85bfff0a4e..e6125dd9f58f 100644
--- a/drivers/firmware/arm_ffa/smccc.c
+++ b/drivers/firmware/arm_ffa/smccc.c
@@ -17,7 +17,7 @@ static void __arm_ffa_fn_hvc(ffa_value_t args, ffa_value_t *res)
 	arm_smccc_1_2_hvc(&args, res);
 }

-int __init ffa_transport_init(ffa_fn **invoke_ffa_fn)
+int ffa_transport_init(ffa_fn **invoke_ffa_fn)
 {
 	enum arm_smccc_conduit conduit;


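The register-then-notify flow in the diff above can be modelled in
userspace roughly as follows. This is only a sketch: the chain here is a
plain unlocked linked list, not the kernel's blocking notifier chain, and
chain_register(), chain_call(), ffa_event_handler() and ffa_init_calls are
hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum kvm_arm_event { PKVM_INITIALISED, KVM_ARM_EVENT_MAX };

struct notifier {
	int (*call)(struct notifier *nb, unsigned long event, void *data);
	struct notifier *next;
};

struct notifier_chain {
	struct notifier *head;
};

/* Add a listener; no locking or priorities in this model. */
void chain_register(struct notifier_chain *chain, struct notifier *nb)
{
	assert(chain && nb && nb->call);
	nb->next = chain->head;
	chain->head = nb;
}

/* Deliver an event to every listener; returns how many were called. */
int chain_call(struct notifier_chain *chain, unsigned long event, void *data)
{
	int called = 0;

	for (struct notifier *nb = chain->head; nb; nb = nb->next) {
		nb->call(nb, event, data);
		called++;
	}
	return called;
}

/* Stand-in for the deferred driver init: counts how often it ran. */
int ffa_init_calls;

int ffa_event_handler(struct notifier *nb, unsigned long event, void *data)
{
	(void)nb;
	(void)data;
	if (event == PKVM_INITIALISED)
		ffa_init_calls++;
	return 0;
}
```

One property the model makes visible: a listener registered after the
event has already been delivered is never called, so the state check made
before registering and the notification itself have to be ordered
consistently.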
> --
> Jazz isn't dead. It just smells funny.

--
Sincerely,
Yeoreum Yun

^ permalink raw reply related

* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Marc Zyngier @ 2026-04-19 10:41 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe, jarkko,
	jgg, sudeep.holla, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will
In-Reply-To: <aeNeNjfO7i128TIP@e129823.arm.com>

On Sat, 18 Apr 2026 11:34:30 +0100,
Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> 
> > > @@ -2035,6 +2037,16 @@ static int __init ffa_init(void)
> > >  	u32 buf_sz;
> > >  	size_t rxtx_bufsz = SZ_4K;
> > >
> > > +	/*
> > > +	 * When pKVM is enabled, the FF-A driver must be initialized
> > > +	 * after pKVM initialization. Otherwise, pKVM cannot negotiate
> > > +	 * the FF-A version or obtain RX/TX buffer information,
> > > +	 * which leads to failures in FF-A calls.
> > > +	 */
> > > +	if (IS_ENABLED(CONFIG_KVM) && is_protected_kvm_enabled() &&
> > > +	    !is_kvm_arm_initialised())
> > > +		return -EPROBE_DEFER;
> > > +
> >
> > That's still fundamentally wrong: pkvm is not ready until
> > finalize_pkvm() has finished, and that's not indicated by
> > is_kvm_arm_initialised().
> 
> Thanks. I miss the TSC bit set in here.

That's the least of the problems. None of the infrastructure is in
place at this stage...

> IMHO, I'd like to make an new state check function --
> is_pkvm_arm_initialised() so that ff-a driver to know whether
> pkvm is initialised.

Doesn't sound great, TBH.

> or any other suggestion?

Instead of adding more esoteric predicates, I'd rather you build on an
existing infrastructure. You have a dependency on KVM, use something
that is designed to enforce dependencies. Device links spring to mind
as something designed for that.

Can you look into enabling this for KVM? If that's possible, then it
should be easy enough to delay the actual KVM registration after pKVM
is finalised.

Thanks,

	M.

-- 
Jazz isn't dead. It just smells funny.

^ permalink raw reply

* Re: [BUG] landlock: warning in collect_domain_accesses via renameat2 path rename
From: Justin Suess @ 2026-04-18 23:08 UTC (permalink / raw)
  To: 王志; +Cc: linux-security-module, linux-kernel, linux-fsdevel, paul
In-Reply-To: <25536ce2.4391.19d9b3484ff.Coremail.23009200614@stu.xidian.edu.cn>

On Fri, Apr 17, 2026 at 07:30:03PM +0800, 王志 wrote:
> Dear Maintainers,
> 
> When using our customized Syzkaller to fuzz the latest Linux kernel, we discovered a crash related to Landlock during a path rename operation.
> 
> HEAD commit: 7d0a66e4bb9081d75c82ec4957c50034cb0ea449

This is the initial 6.18 release, without the stable backported fixes.

> git tree: upstream
> 
> Reproducer and logs:
> Output: https://github.com/manual0/crash/blob/main/cebd27007e806e16cf15cb1e0214c24054e8998e/report1
> Kernel config: https://github.com/manual0/crash/blob/main/6.18-syzbot.config
> C reproducer: https://github.com/manual0/crash/blob/main/cebd27007e806e16cf15cb1e0214c24054e8998e/repro.c
> 
> ----------------------------------------
> 
> Analysis:
> 
> The crash is triggered through the following path:
> 
> renameat2
>   → security_path_rename
>   → current_check_refer_path
>   → collect_domain_accesses
> 
> This indicates that a path rename operation triggers Landlock's path access control checks. The crash occurs inside collect_domain_accesses(), which is responsible for collecting the current process's domain access rights.
> 
> The bug is caused by collect_domain_accesses() traversing inconsistent or invalid Landlock ruleset data during rename path permission checks, leading to unsafe memory access.
> ----------------------------------------
> 
> If you fix this issue, please add the following tag to the commit:
> 
> Reported-by: Zhi Wang <wangzhi@stu.xidian.edu.cn>
>
This was fixed in 6.18.2 with
cadb28f8b3fd6908e3051e86158c65c3a8e1c907 (landlock: Fix handling of
disconnected directories) [1]

So this has been fixed upstream and backported already.

Please target fuzzing against a supported tag.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-6.18.y&id=cadb28f8b3fd6908e3051e86158c65c3a8e1c907

Justin
> Thanks,
> Zhi Wang

^ permalink raw reply

* Re: [RFC PATCH 08/20] bpf: Add Landlock ruleset map type
From: Justin Suess @ 2026-04-18 21:50 UTC (permalink / raw)
  To: Song Liu
  Cc: Mickaël Salaün, ast, daniel, andrii, kpsingh, paul,
	viro, brauner, kees, gnoack, jack, jmorris, serge, yonghong.song,
	martin.lau, m, eddyz87, john.fastabend, sdf, skhan, bpf,
	linux-security-module, linux-kernel, linux-fsdevel
In-Reply-To: <CAPhsuW4CoskfaqEE5yS2LU_mFvNBDsKc5OX1+f=Lkduc2ykSdQ@mail.gmail.com>

On Fri, Apr 17, 2026 at 01:42:02PM -0700, Song Liu wrote:
> On Fri, Apr 17, 2026 at 1:33 PM Justin Suess <utilityemal77@gmail.com> wrote:
> [...]
> > > > > to the caller) and pass them as file descriptor?
> > > > This "pass them as a file descriptor" is the tricky part. It would be
> > > > very convenient if we could send the fd to bpf from userspace and have
> > > > it be implicitly converted (like in the BPF_MAP_TYPE_LANDLOCK_RULESET
> > > > implementation) in one step, but I just don't see a way to do that with
> > > > the bpf_landlock_get_ruleset_from_fd kfunc approach.
> > >
> > > Song's idea to have a generic FD map looks promising.
> > >
> >
> > I agree the generic FD map sounds like a good fit.
> 
> Well, I am not 100% sure a generic FD map adds enough value
> on top of current __kptr solutions. This will be more tricky if we
> have to touch file_operations.
> 
> > So this would be three parts like:
> >
> > 1. The new point-of-no-return flags for NNP and staging domain to
> > execution time in Landlock. Selftests and doc updates.
> > 2. The generic FD map implementation for bpf. Selftests and doc updates.
> > 3. The BPF kfunc implementations for Landlock using the same point-of-no
> > return staging. Selftests and doc updates.
> >
> > The scope of which is probably too big for one series.
> >
> > Luckily part 1 is pretty close to being done as part of my work for v2
> > of this series, and can standalone as a preparatory series for Landlock,
> > since it adds flags and features that have utility outside of BPF.
> >
> > Open for ideas on how to split this up (or even better, for some help in
> > implementation or prior works).
> >
> > I'd like to get some feedback and figue out what this generic fd map
> > should look like and get some more eyes on that idea to avoid wasting
> > reviewer time on an unsuitable implementation.
> 
> I will think more about 2. If it indeed adds good value, the upcoming
> LSF/MM/BPF is a good opportunity to move this forward.
> 
> In the meanwhile, we still need kfuncs to access landlock ruleset.
> Therefore, any work on that front should be useful.
>
Instead of a new map type, could the same use case be fulfilled by a
flag for bpf_map__update_elem()? (BPF_FROM_FD?)

  err = bpf_map__update_elem(map, &key, sizeof(key),
                             &fd, sizeof(fd), BPF_FROM_FD);

We could register an operation at the BTF type level that, given a
struct file *, acquires a reference to the underlying kernel object of a
specific BTF type, much like how destructors are registered.

Something like void* bpf_kptr_acquire_from_file_t(struct file*)

(adding it to btf_field_kptr).

Then this would get a reference on the kernel object for the underlying
file and insert it as a kptr into the map if the file indeed points to
the correct type.

This would be valid only for a map holding a supported __kptr type
implementing the bpf_kptr_acquire_from_file operation.

This flag would allow inserting __kptr from userspace (previously
impossible) with a file descriptor.

This wouldn't need any new file_operations changes nor any new map
types.

This could be implemented for specific kernel object backed FDs as
appropriate.

> Thanks,
> Song

^ permalink raw reply

* Re: [RFC PATCH v4 01/19] landlock: Support socket access-control
From: Mikhail Ivanov @ 2026-04-18 11:29 UTC (permalink / raw)
  To: Günther Noack
  Cc: mic, gnoack, willemdebruijn.kernel, matthieu,
	linux-security-module, netdev, netfilter-devel, yusongping,
	artem.kuzin, konstantin.meskhidze
In-Reply-To: <af464773-b01b-f3a4-474d-0efb2cfae142@huawei-partners.com>

On 11/22/2025 2:13 PM, Mikhail Ivanov wrote:
> On 11/22/2025 1:49 PM, Günther Noack wrote:
>> On Tue, Nov 18, 2025 at 09:46:21PM +0800, Mikhail Ivanov wrote:
>>> +/**
>>> + * struct landlock_socket_attr - Socket protocol definition
>>> + *
>>> + * Argument of sys_landlock_add_rule().
>>> + */
>>> +struct landlock_socket_attr {
>>> +    /**
>>> +     * @allowed_access: Bitmask of allowed access for a socket protocol
>>> +     * (cf. `Socket flags`_).
>>> +     */
>>> +    __u64 allowed_access;
>>> +    /**
>>> +     * @family: Protocol family used for communication
>>> +     * (cf. include/linux/socket.h).
>>> +     */
>>> +    __s32 family;
>>> +    /**
>>> +     * @type: Socket type (cf. include/linux/net.h)
>>> +     */
>>> +    __s32 type;
>>> +    /**
>>> +     * @protocol: Communication protocol specific to protocol family 
>>> set in
>>> +     * @family field.
>>
>> This is specific to both the @family and the @type, not just the @family.
>>
>>> From socket(2):
>>
>>    Normally only a single protocol exists to support a particular
>>    socket type within a given protocol family.
>>
>> For instance, in your commit message above the protocol in the example
>> is IPPROTO_TCP, which would imply the type SOCK_STREAM, but not work
>> with SOCK_DGRAM.
> 
> You're right.
> 

I reviewed the socket(2) semantics again, and this part is about the
kernel mapping (family, type, 0) to the default protocol of the given
family and type. E.g. (AF_INET, SOCK_STREAM, 0) is mapped to (AF_INET,
SOCK_STREAM, IPPROTO_TCP). I would like to document this mapping in the
landlock_socket_attr.protocol field doc.

There should be a list of protocols defined per protocol family. From
socket(2):
	The domain argument specifies a communication domain.
	...
	The protocol number to use is specific to the “communication
	domain” in which communication is to take place.

Such a mapping allows defining strange socket rules when @type is set to -1.
For example:
	struct landlock_socket_attr attr = {
		.family = AF_INET,
		.type = -1,
		.protocol = 0,
	};

This definition corresponds to (AF_INET, SOCK_STREAM, 0->IPPROTO_TCP)
and to (AF_INET, SOCK_DGRAM, 0->IPPROTO_UDP).

I don't see this as a bad thing as long as there is proper documentation
for landlock_socket_attr.
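
A small userspace sketch of how such a rule could be matched, to show the
effect of the wildcard type combined with protocol 0. The matching
semantics are an assumption for illustration, not what the patch
implements; socket_rule, TYPE_ANY, resolve_protocol() and rule_matches()
are hypothetical names, and only the AF_INET/AF_INET6 defaults are
modelled.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

#define TYPE_ANY (-1)

struct socket_rule {
	int family;
	int type;	/* TYPE_ANY matches every socket type */
	int protocol;	/* 0 means "default for (family, type)" */
};

/* Resolve protocol 0 to the default protocol of (family, type). */
int resolve_protocol(int family, int type, int protocol)
{
	if (protocol != 0)
		return protocol;
	if (family == AF_INET || family == AF_INET6) {
		if (type == SOCK_STREAM)
			return IPPROTO_TCP;
		if (type == SOCK_DGRAM)
			return IPPROTO_UDP;
	}
	return 0;	/* no default known to this model */
}

/* Does the rule cover a socket(family, type, protocol) request? */
bool rule_matches(const struct socket_rule *rule,
		  int family, int type, int protocol)
{
	assert(rule != NULL);
	if (rule->family != family)
		return false;
	if (rule->type != TYPE_ANY && rule->type != type)
		return false;
	/* Resolve both sides against the type actually requested. */
	return resolve_protocol(family, type, rule->protocol) ==
	       resolve_protocol(family, type, protocol);
}
```

With a rule of { AF_INET, TYPE_ANY, 0 }, this model accepts both a TCP
stream socket and a UDP datagram socket, which is the "strange" but
explainable behaviour described above.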

^ permalink raw reply

* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Yeoreum Yun @ 2026-04-18 10:34 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe, jarkko,
	jgg, sudeep.holla, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will
In-Reply-To: <87se8sbozv.wl-maz@kernel.org>

Hi Marc,

> On Fri, 17 Apr 2026 18:57:59 +0100,
> Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> >
> > When pKVM is enabled, the FF-A driver must be initialized after pKVM.
> > Otherwise, pKVM cannot negotiate the FF-A version or
> > obtain RX/TX buffer information, leading to failures in FF-A calls.
> >
> > During FF-A driver initialization, check whether pKVM has been initialized.
> > If not, defer probing of the FF-A driver.
> >
> > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > ---
> >  arch/arm64/kvm/arm.c              |  1 +
> >  drivers/firmware/arm_ffa/driver.c | 12 ++++++++++++
> >  2 files changed, 13 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 410ffd41fd73..0f517b1c05cd 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -119,6 +119,7 @@ bool is_kvm_arm_initialised(void)
> >  {
> >  	return kvm_arm_initialised;
> >  }
> > +EXPORT_SYMBOL(is_kvm_arm_initialised);
>
> EXPORT_SYMBOL_GPL(), please.

Okay.

>
> >
> >  int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
> >  {
> > diff --git a/drivers/firmware/arm_ffa/driver.c b/drivers/firmware/arm_ffa/driver.c
> > index 02c76ac1570b..2647d6554afd 100644
> > --- a/drivers/firmware/arm_ffa/driver.c
> > +++ b/drivers/firmware/arm_ffa/driver.c
> > @@ -42,6 +42,8 @@
> >  #include <linux/uuid.h>
> >  #include <linux/xarray.h>
> >
> > +#include <asm/virt.h>
> > +
> >  #include "common.h"
> >
> >  #define FFA_DRIVER_VERSION	FFA_VERSION_1_2
> > @@ -2035,6 +2037,16 @@ static int __init ffa_init(void)
> >  	u32 buf_sz;
> >  	size_t rxtx_bufsz = SZ_4K;
> >
> > +	/*
> > +	 * When pKVM is enabled, the FF-A driver must be initialized
> > +	 * after pKVM initialization. Otherwise, pKVM cannot negotiate
> > +	 * the FF-A version or obtain RX/TX buffer information,
> > +	 * which leads to failures in FF-A calls.
> > +	 */
> > +	if (IS_ENABLED(CONFIG_KVM) && is_protected_kvm_enabled() &&
> > +	    !is_kvm_arm_initialised())
> > +		return -EPROBE_DEFER;
> > +
>
> That's still fundamentally wrong: pkvm is not ready until
> finalize_pkvm() has finished, and that's not indicated by
> is_kvm_arm_initialised().

Thanks. I missed that the TSC bit is set here.
IMHO, I'd like to add a new state check function,
is_pkvm_arm_initialised(), so that the FF-A driver can know whether
pKVM is initialised.

Or is there any other suggestion?

Thanks.

--
Sincerely,
Yeoreum Yun

^ permalink raw reply

* Re: [RFC PATCH 4/4] firmware: arm_ffa: check pkvm initailised when initailise ffa driver
From: Marc Zyngier @ 2026-04-18  9:24 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: linux-security-module, linux-kernel, linux-integrity,
	linux-arm-kernel, kvmarm, paul, jmorris, serge, zohar,
	roberto.sassu, dmitry.kasatkin, eric.snowberg, peterhuewe, jarkko,
	jgg, sudeep.holla, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will
In-Reply-To: <20260417175759.3191279-5-yeoreum.yun@arm.com>

On Fri, 17 Apr 2026 18:57:59 +0100,
Yeoreum Yun <yeoreum.yun@arm.com> wrote:
> 
> When pKVM is enabled, the FF-A driver must be initialized after pKVM.
> Otherwise, pKVM cannot negotiate the FF-A version or
> obtain RX/TX buffer information, leading to failures in FF-A calls.
> 
> During FF-A driver initialization, check whether pKVM has been initialized.
> If not, defer probing of the FF-A driver.
> 
> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> ---
>  arch/arm64/kvm/arm.c              |  1 +
>  drivers/firmware/arm_ffa/driver.c | 12 ++++++++++++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 410ffd41fd73..0f517b1c05cd 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -119,6 +119,7 @@ bool is_kvm_arm_initialised(void)
>  {
>  	return kvm_arm_initialised;
>  }
> +EXPORT_SYMBOL(is_kvm_arm_initialised);

EXPORT_SYMBOL_GPL(), please.

> 
>  int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
>  {
> diff --git a/drivers/firmware/arm_ffa/driver.c b/drivers/firmware/arm_ffa/driver.c
> index 02c76ac1570b..2647d6554afd 100644
> --- a/drivers/firmware/arm_ffa/driver.c
> +++ b/drivers/firmware/arm_ffa/driver.c
> @@ -42,6 +42,8 @@
>  #include <linux/uuid.h>
>  #include <linux/xarray.h>
> 
> +#include <asm/virt.h>
> +
>  #include "common.h"
> 
>  #define FFA_DRIVER_VERSION	FFA_VERSION_1_2
> @@ -2035,6 +2037,16 @@ static int __init ffa_init(void)
>  	u32 buf_sz;
>  	size_t rxtx_bufsz = SZ_4K;
> 
> +	/*
> +	 * When pKVM is enabled, the FF-A driver must be initialized
> +	 * after pKVM initialization. Otherwise, pKVM cannot negotiate
> +	 * the FF-A version or obtain RX/TX buffer information,
> +	 * which leads to failures in FF-A calls.
> +	 */
> +	if (IS_ENABLED(CONFIG_KVM) && is_protected_kvm_enabled() &&
> +	    !is_kvm_arm_initialised())
> +		return -EPROBE_DEFER;
> +

That's still fundamentally wrong: pkvm is not ready until
finalize_pkvm() has finished, and that's not indicated by
is_kvm_arm_initialised().

	M.

-- 
Jazz isn't dead. It just smells funny.

^ permalink raw reply

* Re: [PATCH v5 2/3] ima: trim N IMA event log records
From: steven chen @ 2026-04-17 21:26 UTC (permalink / raw)
  To: Roberto Sassu, linux-integrity
  Cc: zohar, roberto.sassu, dmitry.kasatkin, eric.snowberg, corbet,
	serge, paul, jmorris, linux-security-module, anirudhve,
	gregorylumen, nramas, sushring, linux-doc, steven chen
In-Reply-To: <b0b65c5a2d407301905dc4232eee4b16030920c8.camel@huaweicloud.com>

On 4/7/2026 9:19 AM, Roberto Sassu wrote:
> On Wed, 2026-04-01 at 10:29 -0700, steven chen wrote:
>> Trim N entries of the IMA event logs. Do not clean the hash table.
> The very first change of this patch is the kernel option
> ima_flush_htable option that I introduced for my use case.
>
> At the bottom of this patch you actually check the ima_flush_htable
> boolean, and delete the measurements entries without disconnecting them
> from the hash table, so the digest lookup is done on freed memory.
>
> Next, you duplicated my changes regarding the measurements list
> counter. But instead of removing the old counter from the hash table,
> you keep incrementing both, but use the new one.
>
> In ima_log_trim_open(), you use again my duplicated code to manage
> exclusive write/concurrent read scheme for the measurement interfaces.
> However, for read, if the process does not have CAP_SYS_ADMIN it falls
> back calling _ima_measurements_open(). Not sure it was intended.
Hi Roberto,

I acknowledged in my cover letter that these come from you. Please
let me know the best way to credit your contribution, and I will update
it in the next version.

I will address all of the issues you mentioned above in the next version.
> And, in ima_log_trim_release(), you check again CAP_SYS_ADMIN which is
> redundant, you would not reach this code if the same requirements were
> not met at open time. You also return an error on close().

Will update in next version.
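
For reference, the exclusive-writer/concurrent-reader counting scheme under
discussion can be modelled in userspace roughly as below. This is only a
sketch: the names are hypothetical, it is single-threaded (the patch takes a
mutex around the counter), and it assumes all permission checks happen at
open time, so release needs no capability check and cannot fail.

```c
#include <errno.h>
#include <stdbool.h>

/* > 0: number of readers; < 0: one writer; 0: idle. */
static long sketch_users;

/* Admit any number of readers, or exactly one writer with no readers. */
static int sketch_open(bool write)
{
	if ((write && sketch_users != 0) || (!write && sketch_users < 0))
		return -EBUSY;

	if (write)
		sketch_users--;
	else
		sketch_users++;
	return 0;
}

/* Undo what the matching open did; nothing here can be refused. */
static void sketch_release(bool write)
{
	if (write)
		sketch_users++;
	else
		sketch_users--;
}
```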

Thanks,

> In ima_log_trim_write(), you do manual string to number conversion for
> your first number and use kstrtoul() for the second.
>
> The measurements lists and the associated counter are atomically
> updated in ima_add_digest_entry(), but not atomically accessed in
> ima_delete_event_log(). Also, the measurements list is traversed
> without _rcu variant or lock.

Will update in next version.
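
As an illustration of parsing both fields of a T:N request the same way, a
userspace sketch using strtoul() for each field might look like the
following. The helper names are hypothetical, and kernel code would use the
kstrto*() family instead; this only shows the shape of a uniform parser.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse one unsigned decimal field that must be terminated by 'end_ch'.
 * Returns 0 and stores the value, or a negative errno.
 */
static int sketch_parse_field(const char *s, char end_ch,
			      unsigned long *out, const char **next)
{
	char *end;

	/* Reject empty fields, signs and leading whitespace up front. */
	if (!isdigit((unsigned char)*s))
		return -EINVAL;

	errno = 0;
	*out = strtoul(s, &end, 10);
	if (errno == ERANGE)
		return -ERANGE;
	if (*end != end_ch)
		return -EINVAL;

	*next = end;
	return 0;
}

/* Parse "T:N"; a trailing newline is not accepted in this sketch. */
static int sketch_parse_trim_request(const char *req,
				     unsigned long *t, unsigned long *n)
{
	const char *p;
	int ret;

	ret = sketch_parse_field(req, ':', t, &p);
	if (ret)
		return ret;
	return sketch_parse_field(p + 1, '\0', n, &p);
}
```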

Thanks

>
> While this trimming scheme aims at minimizing the kernel space and user
> space delay, it also introduces the following problem. If two agents
> perform a TPM quote that include a different number of entries, there
> is no guarantee that the one willing to trim less entries wins. Which
> means that, one agent could end up not seeing the most recent entries,
> as they were already trimmed by the other agent.
This should be acceptable: the second trim request will be rejected, and
the agent can still find all log entries in user space, provided all user
agents handle the log correctly.
There is also another way to do it: the user agent can hold the list by
opening ima_trim_log with write permission for the duration of the read,
attestation and trim period.
In this way, the user agent for the "Trim N method" has a user-space hold
time similar to the "staged method" but a shorter kernel list lock time,
and the user-agent requirements for the "Trim N method" are much simpler
than those for the "staged method".
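
The rejection of the second request can be pictured with the small userspace
model below. It is a sketch only: the struct and function names are
hypothetical, the error codes mirror the posted patch only where the cases
overlap, and no locking is modelled.

```c
#include <errno.h>

/* 'total_trimmed' plays the role of T, 'entries' the current list length. */
struct sketch_log {
	long total_trimmed;
	long entries;
};

/* Trim n entries only if the caller's view of T is still current. */
static long sketch_trim(struct sketch_log *log, long t, long n)
{
	if (t != log->total_trimmed)
		return -EINVAL;   /* stale T: another agent trimmed first */
	if (n < 0)
		return -EINVAL;
	if (n > log->entries)
		return -ENOENT;

	log->entries -= n;
	log->total_trimmed += n;
	return n;
}
```

If two agents both read T = 0 and the first writes 0:30, a later 0:10 from
the second agent fails, and that agent has to re-read T (now 30) before it
can trim again.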
>
> My solution is not affected by this problem, since there will be only
> one process collecting all the measurements in user space and exposing
> them to the agents.

Please see above response.

Thanks,

Steven

>
> Also, I didn't understand why T and ima_measure_users have to be
> preserved on soft reboots. Especially ima_measure_users reflects the
> state of open files for a particular kernel, but on soft reboot a new
> kernel is booted.
>
> I personally will not endorse a solution based on the ima_trim_log
> interface. I could accept trimming N even more efficiently than we
> currently do with a lockless walk to determine the cutting position in
> ima_queue_stage(), so that we don't need to splice back entries to the
> measurement list. This would be a replacement of patch 11 in my patch
> set, but this would be as far as I would like to go.
>
> Roberto
>
>> The values saved in hash table were already used.
>>
>> Provide a userspace interface ima_trim_log:
>> When read this interface, it returns total number T of entries trimmed
>> since system boot up.
>> When write to this interface need to provide two numbers T:N to let
>> kernel to trim N entries of IMA event logs.
>>
>> Kernel measurement list lock time performance improvement by not
>> clean the hash table.
>>
>> when kernel get log trim request T:N
>>   - Get the T, compare with the total trimmed number
>>   - if equal, then do trim N and change T to T+N
>>   - else return error
>>
>> Signed-off-by: steven chen <chenste@linux.microsoft.com>
>> ---
>>   .../admin-guide/kernel-parameters.txt         |   4 +
>>   security/integrity/ima/ima.h                  |   4 +-
>>   security/integrity/ima/ima_fs.c               | 198 +++++++++++++++++-
>>   security/integrity/ima/ima_kexec.c            |   2 +-
>>   security/integrity/ima/ima_queue.c            |  96 +++++++++
>>   5 files changed, 296 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index e92c0056e4e0..cd1a1d0bf0e2 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -2197,6 +2197,10 @@
>>   			Use the canonical format for the binary runtime
>>   			measurements, instead of host native format.
>>   
>> +	ima_flush_htable  [IMA]
>> +			Flush the measurement list hash table when trim all
>> +			or a part of it for deletion.
>> +
>>   	ima_hash=	[IMA]
>>   			Format: { md5 | sha1 | rmd160 | sha256 | sha384
>>   				   | sha512 | ... }
>> diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
>> index e3d71d8d56e3..5cbee3a295a0 100644
>> --- a/security/integrity/ima/ima.h
>> +++ b/security/integrity/ima/ima.h
>> @@ -243,11 +243,13 @@ void ima_post_key_create_or_update(struct key *keyring, struct key *key,
>>   				   const void *payload, size_t plen,
>>   				   unsigned long flags, bool create);
>>   #endif
>> -
>> +extern atomic_long_t ima_number_entries;
>>   #ifdef CONFIG_IMA_KEXEC
>>   void ima_measure_kexec_event(const char *event_name);
>> +long ima_delete_event_log(long req_val);
>>   #else
>>   static inline void ima_measure_kexec_event(const char *event_name) {}
>> +static inline long ima_delete_event_log(long req_val) { return 0; }
>>   #endif
>>   
>>   /*
>> diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
>> index 87045b09f120..8e26e0f34311 100644
>> --- a/security/integrity/ima/ima_fs.c
>> +++ b/security/integrity/ima/ima_fs.c
>> @@ -21,6 +21,9 @@
>>   #include <linux/rcupdate.h>
>>   #include <linux/parser.h>
>>   #include <linux/vmalloc.h>
>> +#include <linux/ktime.h>
>> +#include <linux/timekeeping.h>
>> +#include <linux/ima.h>
>>   
>>   #include "ima.h"
>>   
>> @@ -38,6 +41,17 @@ __setup("ima_canonical_fmt", default_canonical_fmt_setup);
>>   
>>   static int valid_policy = 1;
>>   
>> +#define IMA_LOG_TRIM_REQ_NUM_LENGTH 15
>> +#define IMA_LOG_TRIM_REQ_TOTAL_LENGTH 32
>> +atomic_long_t ima_number_entries = ATOMIC_LONG_INIT(0);
>> +static long trimcount;
>> +/* mutex protects atomicity of trimming measurement list
>> + * and also protects atomicity the measurement list read
>> + * write operation.
>> + */
>> +static DEFINE_MUTEX(ima_measure_lock);
>> +static long ima_measure_users;
>> +
>>   static ssize_t ima_show_htable_value(char __user *buf, size_t count,
>>   				     loff_t *ppos, atomic_long_t *val)
>>   {
>> @@ -64,8 +78,7 @@ static ssize_t ima_show_measurements_count(struct file *filp,
>>   					   char __user *buf,
>>   					   size_t count, loff_t *ppos)
>>   {
>> -	return ima_show_htable_value(buf, count, ppos, &ima_htable.len);
>> -
>> +	return ima_show_htable_value(buf, count, ppos, &ima_number_entries);
>>   }
>>   
>>   static const struct file_operations ima_measurements_count_ops = {
>> @@ -202,16 +215,77 @@ static const struct seq_operations ima_measurments_seqops = {
>>   	.show = ima_measurements_show
>>   };
>>   
>> +/*
>> + * _ima_measurements_open - open the IMA measurements file
>> + * @inode: inode of the file being opened
>> + * @file: file being opened
>> + * @seq_ops: sequence operations for the file
>> + *
>> + * Returns 0 on success, or negative error code.
>> + * Implements mutual exclusion between readers and writer
>> + * of the measurements file. Multiple readers are allowed,
>> + * but writer get exclusive access only no other readers/writers.
>> + * Readers is not allowed when there is a writer.
>> + */
>> +static int _ima_measurements_open(struct inode *inode, struct file *file,
>> +				  const struct seq_operations *seq_ops)
>> +{
>> +	bool write = !!(file->f_mode & FMODE_WRITE);
>> +	int ret;
>> +
>> +	if (write && !capable(CAP_SYS_ADMIN))
>> +		return -EPERM;
>> +
>> +	mutex_lock(&ima_measure_lock);
>> +	if ((write && ima_measure_users != 0) ||
>> +	    (!write && ima_measure_users < 0)) {
>> +		mutex_unlock(&ima_measure_lock);
>> +		return -EBUSY;
>> +	}
>> +
>> +	ret = seq_open(file, seq_ops);
>> +	if (ret < 0) {
>> +		mutex_unlock(&ima_measure_lock);
>> +		return ret;
>> +	}
>> +
>> +	if (write)
>> +		ima_measure_users--;
>> +	else
>> +		ima_measure_users++;
>> +
>> +	mutex_unlock(&ima_measure_lock);
>> +	return ret;
>> +}
>> +
>>   static int ima_measurements_open(struct inode *inode, struct file *file)
>>   {
>> -	return seq_open(file, &ima_measurments_seqops);
>> +	return _ima_measurements_open(inode, file, &ima_measurments_seqops);
>> +}
>> +
>> +static int ima_measurements_release(struct inode *inode, struct file *file)
>> +{
>> +	bool write = !!(file->f_mode & FMODE_WRITE);
>> +	int ret;
>> +
>> +	mutex_lock(&ima_measure_lock);
>> +	ret = seq_release(inode, file);
>> +	if (!ret) {
>> +		if (!write)
>> +			ima_measure_users--;
>> +		else
>> +			ima_measure_users++;
>> +	}
>> +
>> +	mutex_unlock(&ima_measure_lock);
>> +	return ret;
>>   }
>>   
>>   static const struct file_operations ima_measurements_ops = {
>>   	.open = ima_measurements_open,
>>   	.read = seq_read,
>>   	.llseek = seq_lseek,
>> -	.release = seq_release,
>> +	.release = ima_measurements_release,
>>   };
>>   
>>   void ima_print_digest(struct seq_file *m, u8 *digest, u32 size)
>> @@ -279,14 +353,114 @@ static const struct seq_operations ima_ascii_measurements_seqops = {
>>   
>>   static int ima_ascii_measurements_open(struct inode *inode, struct file *file)
>>   {
>> -	return seq_open(file, &ima_ascii_measurements_seqops);
>> +	return _ima_measurements_open(inode, file, &ima_ascii_measurements_seqops);
>>   }
>>   
>>   static const struct file_operations ima_ascii_measurements_ops = {
>>   	.open = ima_ascii_measurements_open,
>>   	.read = seq_read,
>>   	.llseek = seq_lseek,
>> -	.release = seq_release,
>> +	.release = ima_measurements_release,
>> +};
>> +
>> +static int ima_log_trim_open(struct inode *inode, struct file *file)
>> +{
>> +	bool write = !!(file->f_mode & FMODE_WRITE);
>> +
>> +	if (!write && capable(CAP_SYS_ADMIN))
>> +		return 0;
>> +	else if (!capable(CAP_SYS_ADMIN))
>> +		return -EPERM;
>> +
>> +	return _ima_measurements_open(inode, file, &ima_measurments_seqops);
>> +}
>> +
>> +static ssize_t ima_log_trim_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
>> +{
>> +	char tmpbuf[IMA_LOG_TRIM_REQ_NUM_LENGTH];
>> +	ssize_t len;
>> +
>> +	len = scnprintf(tmpbuf, sizeof(tmpbuf), "%li\n", trimcount);
>> +	return simple_read_from_buffer(buf, size, ppos, tmpbuf, len);
>> +}
>> +
>> +static ssize_t ima_log_trim_write(struct file *file,
>> +				  const char __user *buf, size_t datalen, loff_t *ppos)
>> +{
>> +	char tmpbuf[IMA_LOG_TRIM_REQ_TOTAL_LENGTH];
>> +	char *p = tmpbuf;
>> +	long count, ret, val = 0, max = LONG_MAX;
>> +
>> +	if (*ppos > 0 || datalen > IMA_LOG_TRIM_REQ_TOTAL_LENGTH || datalen < 2) {
>> +		ret = -EINVAL;
>> +		goto out;
>> +	}
>> +
>> +	if (copy_from_user(tmpbuf, buf, datalen) != 0) {
>> +		ret = -EFAULT;
>> +		goto out;
>> +	}
>> +
>> +	p = tmpbuf;
>> +
>> +	while (*p && *p != ':') {
>> +		if (!isdigit((unsigned char)*p))
>> +			return -EINVAL;
>> +
>> +		/* digit value */
>> +		int d = *p - '0';
>> +
>> +		/* overflow check: val * 10 + d > max -> (val > (max - d) / 10) */
>> +		if (val > (max - d) / 10)
>> +			return -ERANGE;
>> +
>> +		val = val * 10 + d;
>> +		p++;
>> +	}
>> +
>> +	if (*p != ':')
>> +		return -EINVAL;
>> +
>> +	/* verify trim count matches */
>> +	if (val != trimcount)
>> +		return -EINVAL;
>> +
>> +	p++; /* skip ':' */
>> +	ret = kstrtoul(p, 0, &count);
>> +
>> +	if (ret < 0)
>> +		goto out;
>> +
>> +	ret = ima_delete_event_log(count);
>> +
>> +	if (ret < 0)
>> +		goto out;
>> +
>> +	trimcount += ret;
>> +
>> +	ret = datalen;
>> +out:
>> +	return ret;
>> +}
>> +
>> +static int ima_log_trim_release(struct inode *inode, struct file *file)
>> +{
>> +	bool write = !!(file->f_mode & FMODE_WRITE);
>> +
>> +	if (!write && capable(CAP_SYS_ADMIN))
>> +		return 0;
>> +	else if (!capable(CAP_SYS_ADMIN))
>> +		return -EPERM;
>> +
>> +	return ima_measurements_release(inode, file);
>> +}
>> +
>> +static const struct file_operations ima_log_trim_ops = {
>> +	.open = ima_log_trim_open,
>> +	.read = ima_log_trim_read,
>> +	.write = ima_log_trim_write,
>> +	.llseek = generic_file_llseek,
>> +	.release = ima_log_trim_release
>>   };
>>   
>>   static ssize_t ima_read_policy(char *path)
>> @@ -528,6 +702,18 @@ int __init ima_fs_init(void)
>>   		goto out;
>>   	}
>>   
>> +	if (IS_ENABLED(CONFIG_IMA_LOG_TRIMMING)) {
>> +		dentry = securityfs_create_file("ima_trim_log",
>> +						S_IRUSR | S_IRGRP | S_IWUSR | S_IWGRP,
>> +						ima_dir, NULL, &ima_log_trim_ops);
>> +		if (IS_ERR(dentry)) {
>> +			ret = PTR_ERR(dentry);
>> +			goto out;
>> +		}
>> +	}
>> +
>> +	trimcount = 0;
>> +
>>   	dentry = securityfs_create_file("runtime_measurements_count",
>>   				   S_IRUSR | S_IRGRP, ima_dir, NULL,
>>   				   &ima_measurements_count_ops);
>> diff --git a/security/integrity/ima/ima_kexec.c b/security/integrity/ima/ima_kexec.c
>> index 7362f68f2d8b..bee997683e03 100644
>> --- a/security/integrity/ima/ima_kexec.c
>> +++ b/security/integrity/ima/ima_kexec.c
>> @@ -41,7 +41,7 @@ void ima_measure_kexec_event(const char *event_name)
>>   	int n;
>>   
>>   	buf_size = ima_get_binary_runtime_size();
>> -	len = atomic_long_read(&ima_htable.len);
>> +	len = atomic_long_read(&ima_number_entries);
>>   
>>   	n = scnprintf(ima_kexec_event, IMA_KEXEC_EVENT_LEN,
>>   		      "kexec_segment_size=%lu;ima_binary_runtime_size=%lu;"
>> diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
>> index 590637e81ad1..07225e19b9b5 100644
>> --- a/security/integrity/ima/ima_queue.c
>> +++ b/security/integrity/ima/ima_queue.c
>> @@ -22,6 +22,14 @@
>>   
>>   #define AUDIT_CAUSE_LEN_MAX 32
>>   
>> +bool ima_flush_htable;
>> +static int __init ima_flush_htable_setup(char *str)
>> +{
>> +	ima_flush_htable = true;
>> +	return 1;
>> +}
>> +__setup("ima_flush_htable", ima_flush_htable_setup);
>> +
>>   /* pre-allocated array of tpm_digest structures to extend a PCR */
>>   static struct tpm_digest *digests;
>>   
>> @@ -114,6 +122,7 @@ static int ima_add_digest_entry(struct ima_template_entry *entry,
>>   	list_add_tail_rcu(&qe->later, &ima_measurements);
>>   
>>   	atomic_long_inc(&ima_htable.len);
>> +	atomic_long_inc(&ima_number_entries);
>>   	if (update_htable) {
>>   		key = ima_hash_key(entry->digests[ima_hash_algo_idx].digest);
>>   		hlist_add_head_rcu(&qe->hnext, &ima_htable.queue[key]);
>> @@ -220,6 +229,93 @@ int ima_add_template_entry(struct ima_template_entry *entry, int violation,
>>   	return result;
>>   }
>>   
>> +/**
>> + * ima_delete_event_log - delete IMA event entry
>> + * @num_records: number of records to delete
>> + *
>> + * delete num_records entries off the measurement list.
>> + * Returns num_records, or negative error code.
>> + */
>> +long ima_delete_event_log(long num_records)
>> +{
>> +	long len, cur = num_records, tmp_len = 0;
>> +	struct ima_queue_entry *qe, *qe_tmp;
>> +	LIST_HEAD(ima_measurements_to_delete);
>> +	struct list_head *list_ptr;
>> +
>> +	if (!IS_ENABLED(CONFIG_IMA_LOG_TRIMMING))
>> +		return -EOPNOTSUPP;
>> +
>> +	if (num_records <= 0)
>> +		return num_records;
>> +
>> +	list_ptr = &ima_measurements;
>> +
>> +	len = atomic_long_read(&ima_number_entries);
>> +
>> +	if (num_records <= len) {
>> +		list_for_each_entry(qe, list_ptr, later) {
>> +			if (cur > 0) {
>> +				tmp_len += get_binary_runtime_size(qe->entry);
>> +				--cur;
>> +			}
>> +			if (cur == 0) {
>> +				qe_tmp = qe;
>> +				break;
>> +			}
>> +		}
>> +	}
>> +	else {
>> +		return -ENOENT;
>> +	}
>> +
>> +
>> +	mutex_lock(&ima_extend_list_mutex);
>> +	len = atomic_long_read(&ima_number_entries);
>> +
>> +	if (num_records == len) {
>> +		list_replace(&ima_measurements, &ima_measurements_to_delete);
>> +		INIT_LIST_HEAD(&ima_measurements);
>> +		atomic_long_set(&ima_number_entries, 0);
>> +		list_ptr = &ima_measurements_to_delete;
>> +	}
>> +	else {
>> +		__list_cut_position(&ima_measurements_to_delete, &ima_measurements,
>> +				    &qe_tmp->later);
>> +		atomic_long_sub(num_records, &ima_number_entries);
>> +		if (IS_ENABLED(CONFIG_IMA_KEXEC))
>> +			binary_runtime_size -= tmp_len;
>> +	}
>> +
>> +	mutex_unlock(&ima_extend_list_mutex);
>> +
>> +	if (ima_flush_htable)
>> +		synchronize_rcu();
>> +
>> +	list_for_each_entry_safe(qe, qe_tmp, &ima_measurements_to_delete, later) {
>> +		/*
>> +		 * Ok because after list delete qe is only accessed by
>> +		 * ima_lookup_digest_entry().
>> +		 */
>> +		for (int i = 0; i < qe->entry->template_desc->num_fields; i++) {
>> +			kfree(qe->entry->template_data[i].data);
>> +			qe->entry->template_data[i].data = NULL;
>> +			qe->entry->template_data[i].len = 0;
>> +		}
>> +
>> +		list_del(&qe->later);
>> +
>> +		/* No leak if !ima_flush_htable, referenced by ima_htable. */
>> +		if (ima_flush_htable) {
>> +			kfree(qe->entry->digests);
>> +			kfree(qe->entry);
>> +			kfree(qe);
>> +		}
>> +	}
>> +
>> +	return num_records;
>> +}
>> +
>>   int ima_restore_measurement_entry(struct ima_template_entry *entry)
>>   {
>>   	int result = 0;

* Re: [RFC PATCH 08/20] bpf: Add Landlock ruleset map type
From: Song Liu @ 2026-04-17 20:42 UTC (permalink / raw)
  To: Justin Suess
  Cc: Mickaël Salaün, ast, daniel, andrii, kpsingh, paul,
	viro, brauner, kees, gnoack, jack, jmorris, serge, yonghong.song,
	martin.lau, m, eddyz87, john.fastabend, sdf, skhan, bpf,
	linux-security-module, linux-kernel, linux-fsdevel
In-Reply-To: <aeKY_QUge4okHjrW@suesslenovo>

On Fri, Apr 17, 2026 at 1:33 PM Justin Suess <utilityemal77@gmail.com> wrote:
[...]
> > > > to the caller) and pass them as file descriptor?
> > > This "pass them as a file descriptor" is the tricky part. It would be
> > > very convenient if we could send the fd to bpf from userspace and have
> > > it be implicitly converted (like in the BPF_MAP_TYPE_LANDLOCK_RULESET
> > > implementation) in one step, but I just don't see a way to do that with
> > > the bpf_landlock_get_ruleset_from_fd kfunc approach.
> >
> > Song's idea to have a generic FD map looks promising.
> >
>
> I agree the generic FD map sounds like a good fit.

Well, I am not 100% sure a generic FD map adds enough value
on top of the current __kptr solutions. This will be trickier if we
have to touch file_operations.

> So this would be three parts like:
>
> 1. The new point-of-no-return flags for NNP and staging domain to
> execution time in Landlock. Selftests and doc updates.
> 2. The generic FD map implementation for bpf. Selftests and doc updates.
> 3. The BPF kfunc implementations for Landlock using the same point-of-no
> return staging. Selftests and doc updates.
>
> The scope of which is probably too big for one series.
>
> Luckily part 1 is pretty close to being done as part of my work for v2
> of this series, and can standalone as a preparatory series for Landlock,
> since it adds flags and features that have utility outside of BPF.
>
> Open for ideas on how to split this up (or even better, for some help in
> implementation or prior works).
>
> I'd like to get some feedback and figure out what this generic fd map
> should look like and get some more eyes on that idea to avoid wasting
> reviewer time on an unsuitable implementation.

I will think more about 2. If it indeed adds good value, the upcoming
LSF/MM/BPF is a good opportunity to move this forward.

In the meantime, we still need kfuncs to access Landlock rulesets.
Therefore, any work on that front should be useful.

Thanks,
Song


* Re: [RFC PATCH 08/20] bpf: Add Landlock ruleset map type
From: Justin Suess @ 2026-04-17 20:33 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Song Liu, ast, daniel, andrii, kpsingh, paul, viro, brauner, kees,
	gnoack, jack, jmorris, serge, yonghong.song, martin.lau, m,
	eddyz87, john.fastabend, sdf, skhan, bpf, linux-security-module,
	linux-kernel, linux-fsdevel
In-Reply-To: <20260417.aPh1ooQu8esh@digikod.net>

On Fri, Apr 17, 2026 at 08:03:14PM +0200, Mickaël Salaün wrote:
> On Fri, Apr 17, 2026 at 12:51:40PM -0400, Justin Suess wrote:
> > On Fri, Apr 17, 2026 at 05:18:05PM +0200, Mickaël Salaün wrote:
> > > On Fri, Apr 17, 2026 at 10:09:13AM -0400, Justin Suess wrote:
> > > > On Thu, Apr 16, 2026 at 04:47:40PM -0700, Song Liu wrote:
> > > > > On Thu, Apr 16, 2026 at 2:53 PM Justin Suess <utilityemal77@gmail.com> wrote:
> > [...]
> > > Why not using proper typing with a dedicated map?
> > > 
> > 
> > I may be misunderstanding, but from what I see, a __kptr DOES give
> > proper typing, __kptr is an annotation not a type.
> 
> Ok, good.
> 
> > 
> > This is what it would look like in a BPF_MAP_TYPE_ARRAY.
> > 
> >     struct ruleset_kptr_value {
> > 	    struct bpf_landlock_ruleset __kptr * ruleset;
> >     };
> > 
> >     struct {
> > 	      __uint(type, BPF_MAP_TYPE_ARRAY);
> > 	      __uint(max_entries, 1);
> > 	      __type(key, __u32);
> > 	      __type(value, struct ruleset_kptr_value);
> >     } ruleset_kptr_map SEC(".maps");
> > 
> > So we get proper typing from what I see. (It's not like a __kptr is a
> > special void*, it has a type)
> 
> Looks good.
> 
> [...]
> > 
> > The answer to the lifetime part is yes.
> > 
> > The kptr destructors and the landlock ruleset refcounting give us that
> > abstraction. (along with the KF_ACQUIRE/KF_RELEASE annotations and
> > destructor implementation)
> 
> Good.
> 
> > 
> > > to the caller) and pass them as file descriptor?
> > This "pass them as a file descriptor" is the tricky part. It would be
> > very convenient if we could send the fd to bpf from userspace and have
> > it be implicitly converted (like in the BPF_MAP_TYPE_LANDLOCK_RULESET
> > implementation) in one step, but I just don't see a way to do that with
> > the bpf_landlock_get_ruleset_from_fd kfunc approach.
> 
> Song's idea to have a generic FD map looks promising.
> 

I agree the generic FD map sounds like a good fit.

So this would be three parts like:

1. The new point-of-no-return flags for NNP and staging domain to
execution time in Landlock. Selftests and doc updates.
2. The generic FD map implementation for bpf. Selftests and doc updates.
3. The BPF kfunc implementations for Landlock using the same point-of-no
return staging. Selftests and doc updates.

The scope of which is probably too big for one series.

Luckily part 1 is pretty close to being done as part of my work for v2
of this series, and can standalone as a preparatory series for Landlock,
since it adds flags and features that have utility outside of BPF.

Open for ideas on how to split this up (or even better, for some help in
implementation or prior works).

I'd like to get some feedback and figure out what this generic fd map
should look like and get some more eyes on that idea to avoid wasting
reviewer time on an unsuitable implementation.

Justin
