* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Jonathan McDowell @ 2026-04-23 12:53 UTC (permalink / raw)
To: Yeoreum Yun
Cc: Mimi Zohar, linux-security-module, linux-kernel, linux-integrity,
linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
will, noodles, sebastianene
In-Reply-To: <aeoRxWPyOHGJd+Jh@e129823.arm.com>
On Thu, Apr 23, 2026 at 01:34:13PM +0100, Yeoreum Yun wrote:
>> > On Thu, 2026-04-23 at 06:55 +0100, Yeoreum Yun wrote:
>> > > > On Wed, 2026-04-22 at 20:41 +0100, Yeoreum Yun wrote:
>> > > > > > Hi Mimi,
>> > > > > >
>> > > > > > > On Wed, 2026-04-22 at 17:24 +0100, Yeoreum Yun wrote:
>> > > > > > > > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
>> > > > > > > > the TPM driver must be built as built-in and
>> > > > > > > > must be probed before the IMA subsystem is initialized.
>> > > > > > > >
>> > > > > > > > However, when the TPM device operates over the FF-A protocol using
>> > > > > > > > the CRB interface, probing fails and returns -EPROBE_DEFER if
>> > > > > > > > the tpm_crb_ffa device — an FF-A device that provides the communication
>> > > > > > > > interface to the tpm_crb driver — has not yet been probed.
>> > > > > > > >
>> > > > > > > > To ensure the TPM device operating over the FF-A protocol with
>> > > > > > > > the CRB interface is probed before IMA initialization,
>> > > > > > > > the following conditions must be met:
>> > > > > > > >
>> > > > > > > > 1. The corresponding ffa_device must be registered,
>> > > > > > > > which is done via ffa_init().
>> > > > > > > >
>> > > > > > > > 2. The tpm_crb_driver must successfully probe this device via
>> > > > > > > > tpm_crb_ffa_init().
>> > > > > > > >
>> > > > > > > > 3. The tpm_crb driver using CRB over FF-A can then
>> > > > > > > > be probed successfully. (See crb_acpi_add() and
>> > > > > > > > tpm_crb_ffa_init() for reference.)
>> > > > > > > >
>> > > > > > > > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
>> > > > > > > > all registered with device_initcall, which means crb_acpi_driver_init() may
>> > > > > > > > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
>> > > > > > > >
>> > > > > > > > When this occurs, probing the TPM device is deferred.
>> > > > > > > > However, the deferred probe can happen after the IMA subsystem
>> > > > > > > > has already been initialized, since IMA initialization is performed
>> > > > > > > > during late_initcall, and deferred_probe_initcall() is performed
>> > > > > > > > at the same level.
>> > > > > > > >
>> > > > > > > > To resolve this, call ima_init() again at the late_initcall_sync level
>> > > > > > > > so that IMA does not miss the TPM PCR values when generating the
>> > > > > > > > boot_aggregate log even though a TPM device is present in the system.
>> > > > > > > >
>> > > > > > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
>> > > > > > >
>> > > > > > > A lot of change for just detecting whether ima_init() is being called on
>> > > > > > > late_initcall or late_initcall_sync(), without any explanation for all the other
>> > > > > > > changes (e.g. ima_init_core).
>> > > > > > >
>> > > > > > > Please just limit the change to just calling ima_init() twice.
>> > > > > >
>> > > > > > My concern is that ima_update_policy_flags() will be called
>> > > > > > when ima_init() is deferred, i.e. when nothing has been initialised.
>> > > > > > Functionally that might be okay, but logically I think
>> > > > > > ima_update_policy_flags() and the notifier should only run after
>> > > > > > ima_init() has succeeded.
>> > > > > >
>> > > > > > I don't think this change is that large: it just wraps ima_init() in
>> > > > > > ima_init_core() with some error handling.
>> > > > > >
>> > > > > > Am I missing something?
>> > > > >
>> > > > > Also, if we handle this only inside ima_init() and it fails for another
>> > > > > reason, we shouldn't call ima_init() again at late_initcall_sync.
>> > > > >
>> > > > > So this can't be handled inside ima_init(); it needs to be handled
>> > > > > by the caller of ima_init().
>> > > >
>> > > > Only tpm_default_chip() is being called to set the ima_tpm_chip. On failure,
>> > > > instead of going into TPM-bypass mode, return immediately. There are no calls
>> > > > to anything else. Just call ima_init() a second time.
>> > >
>> > > I’m not fully convinced this is sufficient.
>> > >
>> > > What I meant is the case where ima_init() fails due to other
>> > > initialisation steps, not only tpm_default_chip() (e.g. ima_fs_init()).
>> >
>> > The purpose of THIS patch is to add late_initcall_sync, when the TPM is not
>> > available at late_initcall. This would be classified as a bug fix and would be
>> > backported. No other changes should be included in this patch.
>>
>> Okay.
>>
>> > >
>> > > I’d also like to ask again whether it is fine to call
>> > > ima_update_policy_flags() and keep the notifier registered in the
>> > > deferred TPM case. While this may be functionally acceptable, it seems
>> > > logically questionable to do so when ima_init() has not completed.
>> >
>> > Other than extending the TPM, IMA should behave exactly the same whether there
>> > is a TPM or goes into TPM-bypass mode.
>> >
>> > >
>> > > There is also a possibility that a deferred case ultimately fails (e.g.
>> > > deferred at late_initcall, but then failing at late_initcall_sync
>> > > for another reason, even while entering TPM bypass mode). In that case,
>> > > it seems more appropriate to handle this state in the caller of
>> > > ima_init(), rather than inside ima_init() itself.
>> >
>> > If the TPM isn't found at late_initcall_sync(), then IMA should go into TPM-
>> > bypass mode. Please don't make any other changes to the existing IMA behavior
>> > and hide it here behind the late_initcall_sync change.
>>
>> Okay. You're saying that calling ima_update_policy_flags() at late_initcall
>> wouldn't be a problem even if late_initcall_sync's ima_init()
>> ends up in "TPM-bypass mode".
>>
>> I see. I'll make the patch simpler then.
>
>But consider the following situation:
> - late_initcall's first ima_init() is deferred.
> - late_initcall_sync tries again, fails, and then tries once more with
>   CONFIG_IMA_DEFAULT_HASH.
>
>I would like to keep ima_init_core() to avoid repeating the same code
>in late_initcall_sync.
I think what Mimi's proposing is:
If we're in late_initcall, and the TPM isn't available, return
immediately with an error (the EPROBE_DEFER?), don't do any init.
If we're in late_initcall_sync, either we're already initialised, so
return and do nothing, or run through the entire flow, even if the TPM
isn't available.
So ima_init() just needs to know a) if it's in the sync or non-sync mode
and b) for the sync mode, if we've already done the init at
non-sync.
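
Roughly, as a userspace mock rather than real kernel code (struct mock_ima, mock_ima_init() and the 'sync' flag are all made up for illustration, and the EPROBE_DEFER value is only defined locally because userspace headers don't export it), the flow might look something like this:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517 /* kernel-internal errno, defined here for the mock */
#endif

/* Hypothetical stand-in for IMA state; this is not the real IMA code. */
struct mock_ima {
	bool tpm_present; /* what a mocked tpm_default_chip() would find */
	bool initialised; /* set once the full init flow has run */
	bool tpm_bypass;  /* true if init completed without a TPM */
};

/*
 * Mocked ima_init(): 'sync' is false for the late_initcall pass and
 * true for the late_initcall_sync pass.
 */
static int mock_ima_init(struct mock_ima *ima, bool sync)
{
	assert(ima != NULL);

	/* b) the earlier pass already did the work: return, do nothing. */
	if (ima->initialised)
		return 0;

	/* a) non-sync pass without a TPM: defer and do no init at all. */
	if (!sync && !ima->tpm_present)
		return -EPROBE_DEFER;

	/* TPM found, or this is the last chance: run the entire flow. */
	ima->tpm_bypass = !ima->tpm_present;
	ima->initialised = true;
	return 0;
}
```

Calling it with sync=false and no TPM returns -EPROBE_DEFER and leaves the state untouched; a later call with sync=true completes the init, with or without a TPM.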
J.
--
... I'm not popular enough to be different.
^ permalink raw reply
* Re: [PATCH v2 0/4] Firmware LSM hook
From: Leon Romanovsky @ 2026-04-23 13:05 UTC (permalink / raw)
To: Paul Moore
Cc: Jason Gunthorpe, Roberto Sassu, KP Singh, Matt Bobrowski,
Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
Saeed Mahameed, Itay Avraham, Dave Jiang, Jonathan Cameron, bpf,
linux-kernel, linux-kselftest, linux-rdma, Chiara Meiohas,
Maher Sanalla, linux-security-module
In-Reply-To: <CAHC9VhSECYihup=tURo_Qk__xUdYYPkHgnz5CWA0BrRAkvwbog@mail.gmail.com>
On Wed, Apr 15, 2026 at 05:40:04PM -0400, Paul Moore wrote:
> On Wed, Apr 15, 2026 at 9:47 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > On Tue, Apr 14, 2026 at 04:27:58PM -0400, Paul Moore wrote:
> > > On Mon, Apr 13, 2026 at 7:19 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > > On Mon, Apr 13, 2026 at 06:36:06PM -0400, Paul Moore wrote:
> > > > > On Mon, Apr 13, 2026 at 12:42 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > > > > On Sun, Apr 12, 2026 at 09:38:35PM -0400, Paul Moore wrote:
>
> ...
<...>
> > > > > so that only the firmware would need to parse the request. If we
> > > > > wanted to adopt a secmark-esque approach, one could develop a second
> > > > > parsing mechanism that would be responsible for assigning a LSM label
> > > > > to the request, and then pass the firmware request to the LSM, but I
> > > > > do worry a bit about the added complexity associated with keeping the
> > > > > parser sync'd with the driver/fw.
> > > >
> > > > In practice it would be like iptables, the parser would be entirely
> > > > programmed by userspace and there is nothing to keep in sync.
> > >
> > > You've mentioned a few times now that the firmware/request will vary
> > > across not only devices, but firmware revisions too,
> >
> > I never said firmware revisions, part of the requirement is strong ABI
> > > compatibility in these packets.
>
> That was my mistake; it was Leon.
>
> Leon mentioned that different firmware revisions would have different
> parameters for a given opcode, and that one would need to inspect
> those parameters to properly filter the command. Is that not true, or
> am I misreading or misunderstanding Leon's comments?
>
> https://lore.kernel.org/all/20260310175759.GD12611@unreal
Right, I said that. The mlx5–FW interface is stable, but that does not
mean it can never change. The contract is that any upstream driver
release must continue to operate correctly with released firmware.
To support this, there are cases where the driver and firmware
negotiate during device initialization to determine whether a given
feature is supported and specific mailbox fields are valid.
Thanks
^ permalink raw reply
* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Yeoreum Yun @ 2026-04-23 13:07 UTC (permalink / raw)
To: Jonathan McDowell
Cc: Mimi Zohar, linux-security-module, linux-kernel, linux-integrity,
linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
will, noodles, sebastianene
In-Reply-To: <aeoWO2Cwo04YYu2l@earth.li>
Hi,
> I think what Mimi's proposing is:
>
> If we're in late_initcall, and the TPM isn't available, return immediately
> with an error (the EPROBE_DEFER?), don't do any init.
>
> If we're in late_initcall_sync, either we're already initialised, so
> return and do nothing, or run through the entire flow, even if the TPM isn't
> available.
>
> So ima_init() just needs to know a) if it's in the sync or non-sync mode and
> b) for the sync mode, if we've already done the init at
> non-sync.
But think about what happens at late_initcall_sync.
There, whether the TPM is present or IMA is in bypass mode, if ima_init()
fails it has to be tried again with the default hash when the configured
hash is not the default one (e.g. the user booted with ima_hash=md5).
IOW, late_initcall_sync has to call ima_init() twice, just as the existing
code does. What I mean is to wrap that duplicated code in ima_init_core(),
so that in the deferred case late_initcall_sync can also retry
ima_init() with the default hash.
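
A minimal userspace sketch of the retry I mean, not the actual patch: MOCK_DEFAULT_HASH stands in for CONFIG_IMA_DEFAULT_HASH, and mock_init_fn, mock_ima_init_core() and mock_init_sha256_only() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_DEFAULT_HASH "sha256" /* stands in for CONFIG_IMA_DEFAULT_HASH */

/* Hypothetical init hook: returns 0 on success, negative on failure. */
typedef int (*mock_init_fn)(const char *hash_algo);

/*
 * Try to initialise with the requested algorithm; if that fails and the
 * request was not already the default, retry once with the default.
 * On return, *used points at the algorithm of the last attempt.
 */
static int mock_ima_init_core(mock_init_fn init, const char *requested,
			      const char **used)
{
	int err;

	assert(init != NULL && requested != NULL && used != NULL);

	*used = requested;
	err = init(requested);
	if (err && strcmp(requested, MOCK_DEFAULT_HASH) != 0) {
		*used = MOCK_DEFAULT_HASH;
		err = init(MOCK_DEFAULT_HASH);
	}
	return err;
}

/* Example hook for demonstration: pretends only sha256 can be set up. */
static int mock_init_sha256_only(const char *hash_algo)
{
	return strcmp(hash_algo, "sha256") == 0 ? 0 : -1;
}
```

With one wrapper like this, both initcall levels could share the same retry instead of open-coding it twice.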
--
Sincerely,
Yeoreum Yun
^ permalink raw reply
* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Mimi Zohar @ 2026-04-23 13:43 UTC (permalink / raw)
To: Jonathan McDowell, Yeoreum Yun
Cc: linux-security-module, linux-kernel, linux-integrity,
linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
will, noodles, sebastianene
In-Reply-To: <aeoWO2Cwo04YYu2l@earth.li>
On Thu, 2026-04-23 at 13:53 +0100, Jonathan McDowell wrote:
> I think what Mimi's proposing is:
>
> If we're in late_initcall, and the TPM isn't available, return
> immediately with an error (the EPROBE_DEFER?), don't do any init.
>
> If we're in late_initcall_sync, either we're already initialised, so
> return and do nothing, or run through the entire flow, even if the TPM
> isn't available.
>
> So ima_init() just needs to know a) if it's in the sync or non-sync mode
> and b) for the sync mode, if we've already done the init at
> non-sync.
Thanks, Jonathan. That is exactly what I'm suggesting. Any other changes
should not be included in this patch. Since Yeoreum is not hearing me, feel
free to post a patch.
Mimi
^ permalink raw reply
* Re: [RFC PATCH v1 10/11] samples/landlock: Add capability and namespace restriction support
From: Mickaël Salaün @ 2026-04-23 13:51 UTC (permalink / raw)
To: Günther Noack
Cc: Christian Brauner, Günther Noack, Paul Moore,
Serge E . Hallyn, Justin Suess, Lennart Poettering,
Mikhail Ivanov, Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang,
kernel-team, linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260422.cd00ad04e709@gnoack.org>
On Wed, Apr 22, 2026 at 11:20:45PM +0200, Günther Noack wrote:
> On Thu, Mar 12, 2026 at 11:04:43AM +0100, Mickaël Salaün wrote:
> > Extend the sandboxer sample to demonstrate the new Landlock capability
> > and namespace restriction features. The LL_CAPS environment variable
> > takes a colon-delimited list of allowed capability numbers (e.g. "18"
> > for CAP_SYS_CHROOT). The LL_NS variable takes a colon-delimited list of
> > allowed namespace types by short name (e.g. "user:uts:net"). Update
> > LANDLOCK_ABI_LAST to 9 and add best-effort degradation for older
> > kernels.
> >
> > Allow creating user and UTS namespaces but deny network namespaces
> > (works as an unprivileged user). All capabilities are available
> > (LL_CAPS is not set), but namespace creation is still restricted to the
> > types listed in LL_NS. The first command succeeds because user and UTS
> > types are in the allowed set, and sets the hostname inside the new UTS
> > namespace. The second command fails because the network namespace type
> > is not allowed by the LANDLOCK_PERM_NAMESPACE_ENTER rule:
> >
> > LL_FS_RO=/ LL_FS_RW=/proc LL_NS="user:uts" \
> > ./sandboxer /bin/sh -c \
> > "unshare --user --uts --map-root-user hostname sandbox \
> > && ! unshare --user --net true"
> >
> > Allow only user namespace creation and CAP_SYS_CHROOT (18), denying all
> > other capabilities and namespace types (works as an unprivileged user).
> > An unprivileged process creates a user namespace (no capability
> > required) and calls chroot inside it using the CAP_SYS_CHROOT granted
> > within the new namespace:
> >
> > LL_FS_RO=/ LL_FS_RW="" LL_NS="user" LL_CAPS="18" \
> > ./sandboxer /bin/sh -c \
> > "unshare --user --keep-caps chroot / true"
> >
> > Cc: Christian Brauner <brauner@kernel.org>
> > Cc: Günther Noack <gnoack@google.com>
> > Cc: Paul Moore <paul@paul-moore.com>
> > Cc: Serge E. Hallyn <serge@hallyn.com>
> > Signed-off-by: Mickaël Salaün <mic@digikod.net>
> > ---
> > samples/landlock/sandboxer.c | 164 +++++++++++++++++++++++++++++++++--
> > 1 file changed, 155 insertions(+), 9 deletions(-)
> >
> > diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
> > index 9f21088c0855..09c499703835 100644
> > --- a/samples/landlock/sandboxer.c
> > +++ b/samples/landlock/sandboxer.c
> > @@ -14,6 +14,8 @@
> > #include <fcntl.h>
> > #include <linux/landlock.h>
> > #include <linux/socket.h>
> > +#include <sched.h>
> > +#include <stdbool.h>
> > #include <stddef.h>
> > #include <stdio.h>
> > #include <stdlib.h>
> > @@ -22,12 +24,16 @@
> > #include <sys/stat.h>
> > #include <sys/syscall.h>
> > #include <unistd.h>
> > -#include <stdbool.h>
> >
> > #if defined(__GLIBC__)
> > #include <linux/prctl.h>
> > #endif
> >
> > +/* From include/linux/bits.h, not available in userspace. */
> > +#ifndef BITS_PER_TYPE
> > +#define BITS_PER_TYPE(type) (sizeof(type) * 8)
> > +#endif
> > +
> > #ifndef landlock_create_ruleset
> > static inline int
> > landlock_create_ruleset(const struct landlock_ruleset_attr *const attr,
> > @@ -60,6 +66,8 @@ static inline int landlock_restrict_self(const int ruleset_fd,
> > #define ENV_FS_RW_NAME "LL_FS_RW"
> > #define ENV_TCP_BIND_NAME "LL_TCP_BIND"
> > #define ENV_TCP_CONNECT_NAME "LL_TCP_CONNECT"
> > +#define ENV_CAPS_NAME "LL_CAPS"
> > +#define ENV_NS_NAME "LL_NS"
> > #define ENV_SCOPED_NAME "LL_SCOPED"
> > #define ENV_FORCE_LOG_NAME "LL_FORCE_LOG"
> > #define ENV_DELIMITER ":"
> > @@ -226,11 +234,125 @@ static int populate_ruleset_net(const char *const env_var, const int ruleset_fd,
> > return ret;
> > }
> >
> > +static __u64 str2ns(const char *const name)
> > +{
> > + static const struct {
> > + const char *name;
> > + __u64 value;
> > + } ns_map[] = {
> > + /* clang-format off */
> > + { "cgroup", CLONE_NEWCGROUP },
> > + { "ipc", CLONE_NEWIPC },
> > + { "mnt", CLONE_NEWNS },
> > + { "net", CLONE_NEWNET },
> > + { "pid", CLONE_NEWPID },
> > + { "time", CLONE_NEWTIME },
> > + { "user", CLONE_NEWUSER },
> > + { "uts", CLONE_NEWUTS },
> > + /* clang-format on */
> > + };
> > + size_t i;
> > +
> > + for (i = 0; i < sizeof(ns_map) / sizeof(ns_map[0]); i++) {
> > + if (strcmp(name, ns_map[i].name) == 0)
> > + return ns_map[i].value;
> > + }
> > + return 0;
> > +}
> > +
> > +static int populate_ruleset_caps(const char *const env_var,
> > + const int ruleset_fd)
> > +{
> > + int ret = 1;
> > + char *env_cap_name, *env_cap_name_next, *strcap;
> > + struct landlock_capability_attr cap_attr = {
> > + .allowed_perm = LANDLOCK_PERM_CAPABILITY_USE,
> > + };
> > +
> > + env_cap_name = getenv(env_var);
> > + if (!env_cap_name)
> > + return 0;
> > + env_cap_name = strdup(env_cap_name);
> > + unsetenv(env_var);
> > +
> > + env_cap_name_next = env_cap_name;
> > + while ((strcap = strsep(&env_cap_name_next, ENV_DELIMITER))) {
> > + __u64 cap;
> > +
> > + if (strcmp(strcap, "") == 0)
> > + continue;
> > +
> > + if (str2num(strcap, &cap) ||
>
> libcap has cap_from_name(3). I believe we are linking with libcap
> already to drop them before tests. (I have not used this function
> myself yet, but it sounds like it would address this case.)
libcap is only used for kselftests, not this sample, but yes, let's use
libcap here too.
>
>
> > + cap >= BITS_PER_TYPE(cap_attr.capabilities)) {
> > + fprintf(stderr,
> > + "Failed to parse capability at \"%s\"\n",
> > + strcap);
> > + goto out_free_name;
> > + }
> > + cap_attr.capabilities = 1ULL << cap;
> > + if (landlock_add_rule(ruleset_fd, LANDLOCK_RULE_CAPABILITY,
> > + &cap_attr, 0)) {
> > + fprintf(stderr,
> > + "Failed to update the ruleset with capability \"%llu\": %s\n",
> > + (unsigned long long)cap, strerror(errno));
> > + goto out_free_name;
> > + }
> > + }
> > + ret = 0;
> > +
> > +out_free_name:
> > + free(env_cap_name);
> > + return ret;
> > +}
> > +
> > +static int populate_ruleset_ns(const char *const env_var, const int ruleset_fd)
> > +{
> > + int ret = 1;
> > + char *env_ns_name, *env_ns_name_next, *strns;
> > + struct landlock_namespace_attr ns_attr = {
> > + .allowed_perm = LANDLOCK_PERM_NAMESPACE_ENTER,
> > + };
> > +
> > + env_ns_name = getenv(env_var);
> > + if (!env_ns_name)
> > + return 0;
> > + env_ns_name = strdup(env_ns_name);
> > + unsetenv(env_var);
> > +
> > + env_ns_name_next = env_ns_name;
> > + while ((strns = strsep(&env_ns_name_next, ENV_DELIMITER))) {
> > + __u64 ns_type;
> > +
> > + if (strcmp(strns, "") == 0)
> > + continue;
> > +
> > + ns_type = str2ns(strns);
> > + if (!ns_type) {
> > + fprintf(stderr, "Unknown namespace type \"%s\"\n",
> > + strns);
> > + goto out_free_name;
> > + }
> > + ns_attr.namespace_types = ns_type;
> > + if (landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NAMESPACE,
> > + &ns_attr, 0)) {
> > + fprintf(stderr,
> > + "Failed to update the ruleset with namespace \"%s\": %s\n",
> > + strns, strerror(errno));
> > + goto out_free_name;
> > + }
> > + }
> > + ret = 0;
> > +
> > +out_free_name:
> > + free(env_ns_name);
> > + return ret;
> > +}
> > +
> > /* Returns true on error, false otherwise. */
> > static bool check_ruleset_scope(const char *const env_var,
> > struct landlock_ruleset_attr *ruleset_attr)
> > {
> > - char *env_type_scope, *env_type_scope_next, *ipc_scoping_name;
> > + char *env_type_scope, *env_type_scope_next, *scope_name;
> > bool error = false;
> > bool abstract_scoping = false;
> > bool signal_scoping = false;
> > @@ -247,16 +369,14 @@ static bool check_ruleset_scope(const char *const env_var,
> >
> > env_type_scope = strdup(env_type_scope);
> > env_type_scope_next = env_type_scope;
> > - while ((ipc_scoping_name =
> > - strsep(&env_type_scope_next, ENV_DELIMITER))) {
> > - if (strcmp("a", ipc_scoping_name) == 0 && !abstract_scoping) {
> > + while ((scope_name = strsep(&env_type_scope_next, ENV_DELIMITER))) {
> > + if (strcmp("a", scope_name) == 0 && !abstract_scoping) {
> > abstract_scoping = true;
> > - } else if (strcmp("s", ipc_scoping_name) == 0 &&
> > - !signal_scoping) {
> > + } else if (strcmp("s", scope_name) == 0 && !signal_scoping) {
> > signal_scoping = true;
> > } else {
> > fprintf(stderr, "Unknown or duplicate scope \"%s\"\n",
> > - ipc_scoping_name);
> > + scope_name);
> > error = true;
> > goto out_free_name;
> > }
> > @@ -299,7 +419,7 @@ static bool check_ruleset_scope(const char *const env_var,
> >
> > /* clang-format on */
> >
> > -#define LANDLOCK_ABI_LAST 8
> > +#define LANDLOCK_ABI_LAST 9
> >
> > #define XSTR(s) #s
> > #define STR(s) XSTR(s)
> > @@ -322,6 +442,10 @@ static const char help[] =
> > "means an empty list):\n"
> > "* " ENV_TCP_BIND_NAME ": ports allowed to bind (server)\n"
> > "* " ENV_TCP_CONNECT_NAME ": ports allowed to connect (client)\n"
> > + "* " ENV_CAPS_NAME ": capability numbers allowed to use "
> > + "(e.g. 10 for CAP_NET_BIND_SERVICE, 21 for CAP_SYS_ADMIN)\n"
> > + "* " ENV_NS_NAME ": namespace types allowed to enter "
> > + "(cgroup, ipc, mnt, net, pid, time, user, uts)\n"
> > "* " ENV_SCOPED_NAME ": actions denied on the outside of the landlock domain\n"
> > " - \"a\" to restrict opening abstract unix sockets\n"
> > " - \"s\" to restrict sending signals\n"
> > @@ -334,6 +458,8 @@ static const char help[] =
> > ENV_FS_RW_NAME "=\"/dev/null:/dev/full:/dev/zero:/dev/pts:/tmp\" "
> > ENV_TCP_BIND_NAME "=\"9418\" "
> > ENV_TCP_CONNECT_NAME "=\"80:443\" "
> > + ENV_CAPS_NAME "=\"21\" "
> > + ENV_NS_NAME "=\"user:uts:net\" "
> > ENV_SCOPED_NAME "=\"a:s\" "
> > "%1$s bash -i\n"
> > "\n"
> > @@ -357,6 +483,8 @@ int main(const int argc, char *const argv[], char *const *const envp)
> > LANDLOCK_ACCESS_NET_CONNECT_TCP,
> > .scoped = LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
> > LANDLOCK_SCOPE_SIGNAL,
> > + .handled_perm = LANDLOCK_PERM_CAPABILITY_USE |
> > + LANDLOCK_PERM_NAMESPACE_ENTER,
> > };
> > int supported_restrict_flags = LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON;
> > int set_restrict_flags = 0;
> > @@ -438,6 +566,10 @@ int main(const int argc, char *const argv[], char *const *const envp)
> > ~LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON;
> > __attribute__((fallthrough));
> > case 7:
> > + __attribute__((fallthrough));
> > + case 8:
> > + /* Removes permission support for ABI < 9 */
> > + ruleset_attr.handled_perm = 0;
> > /* Must be printed for any ABI < LANDLOCK_ABI_LAST. */
> > fprintf(stderr,
> > "Hint: You should update the running kernel "
> > @@ -470,6 +602,14 @@ int main(const int argc, char *const argv[], char *const *const envp)
> > ~LANDLOCK_ACCESS_NET_CONNECT_TCP;
> > }
> >
> > + /* Removes capability handling if not set by a user. */
> > + if (!getenv(ENV_CAPS_NAME))
> > + ruleset_attr.handled_perm &= ~LANDLOCK_PERM_CAPABILITY_USE;
> > +
> > + /* Removes namespace handling if not set by a user. */
> > + if (!getenv(ENV_NS_NAME))
> > + ruleset_attr.handled_perm &= ~LANDLOCK_PERM_NAMESPACE_ENTER;
> > +
> > if (check_ruleset_scope(ENV_SCOPED_NAME, &ruleset_attr))
> > return 1;
> >
> > @@ -514,6 +654,12 @@ int main(const int argc, char *const argv[], char *const *const envp)
> > goto err_close_ruleset;
> > }
> >
> > + if (populate_ruleset_caps(ENV_CAPS_NAME, ruleset_fd))
> > + goto err_close_ruleset;
> > +
> > + if (populate_ruleset_ns(ENV_NS_NAME, ruleset_fd))
> > + goto err_close_ruleset;
> > +
> > if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
> > perror("Failed to restrict privileges");
> > goto err_close_ruleset;
> > --
> > 2.53.0
> >
>
^ permalink raw reply
* Re: [RFC PATCH v1 11/11] landlock: Add documentation for capability and namespace restrictions
From: Mickaël Salaün @ 2026-04-23 13:52 UTC (permalink / raw)
To: Günther Noack
Cc: Christian Brauner, Günther Noack, Paul Moore,
Serge E . Hallyn, Justin Suess, Lennart Poettering,
Mikhail Ivanov, Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang,
kernel-team, linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260422.5a7059c06fb0@gnoack.org>
On Wed, Apr 22, 2026 at 10:38:33PM +0200, Günther Noack wrote:
> Hello!
>
> On Thu, Mar 12, 2026 at 11:04:44AM +0100, Mickaël Salaün wrote:
> > Document the two new Landlock permission categories in the userspace
> > API guide, admin guide, and kernel security documentation.
> >
> > The userspace API guide adds sections on capability restriction
> > (LANDLOCK_PERM_CAPABILITY_USE with LANDLOCK_RULE_CAPABILITY), namespace
> > restriction (LANDLOCK_PERM_NAMESPACE_ENTER with LANDLOCK_RULE_NAMESPACE
> > covering creation via unshare/clone and entry via setns), and the
> > backward-compatible degradation pattern for ABI < 9. A table documents
> > the per-namespace-type capability requirements for both creation and
> > entry.
> >
> > The admin guide adds the new perm.namespace_enter and
> > perm.capability_use audit blocker names with their object identification
> > fields (namespace_type, namespace_inum, capability).
> >
> > The kernel security documentation adds a "Ruleset restriction models"
> > section defining the three models (handled_access_*, handled_perm,
> > scoped), their coverage and compatibility properties, and the criteria
> > for choosing between them for future features. It also documents
> > composability with user namespaces and adds kernel-doc references for
> > the new capability and namespace headers.
> >
> > Cc: Christian Brauner <brauner@kernel.org>
> > Cc: Günther Noack <gnoack@google.com>
> > Cc: Paul Moore <paul@paul-moore.com>
> > Cc: Serge E. Hallyn <serge@hallyn.com>
> > Signed-off-by: Mickaël Salaün <mic@digikod.net>
> > ---
> > Documentation/admin-guide/LSM/landlock.rst | 19 ++-
> > Documentation/security/landlock.rst | 80 ++++++++++-
> > Documentation/userspace-api/landlock.rst | 156 ++++++++++++++++++++-
> > 3 files changed, 245 insertions(+), 10 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/LSM/landlock.rst b/Documentation/admin-guide/LSM/landlock.rst
> > index 9923874e2156..99c6a599ce9e 100644
> > --- a/Documentation/admin-guide/LSM/landlock.rst
> > +++ b/Documentation/admin-guide/LSM/landlock.rst
> > @@ -6,7 +6,7 @@ Landlock: system-wide management
> > ================================
> >
> > :Author: Mickaël Salaün
> > -:Date: January 2026
> > +:Date: March 2026
> >
> > Landlock can leverage the audit framework to log events.
> >
> > @@ -59,14 +59,25 @@ AUDIT_LANDLOCK_ACCESS
> > - scope.abstract_unix_socket - Abstract UNIX socket connection denied
> > - scope.signal - Signal sending denied
> >
> > + **perm.*** - Permission restrictions (ABI 9+):
> > + - perm.namespace_enter - Namespace entry was denied (creation via
> > + :manpage:`unshare(2)` / :manpage:`clone(2)` or joining via
> > + :manpage:`setns(2)`);
> > + ``namespace_type`` indicates the type (hex CLONE_NEW* bitmask),
> > + ``namespace_inum`` identifies the target namespace for
> > + :manpage:`setns(2)` operations
> > + - perm.capability_use - Capability use was denied;
> > + ``capability`` indicates the capability number
> > +
> > Multiple blockers can appear in a single event (comma-separated) when
> > multiple access rights are missing. For example, creating a regular file
> > in a directory that lacks both ``make_reg`` and ``refer`` rights would show
> > ``blockers=fs.make_reg,fs.refer``.
> >
> > - The object identification fields (path, dev, ino for filesystem; opid,
> > - ocomm for signals) depend on the type of access being blocked and provide
> > - context about what resource was involved in the denial.
> > + The object identification fields depend on the type of access being blocked:
> > + ``path``, ``dev``, ``ino`` for filesystem; ``opid``, ``ocomm`` for signals;
> > + ``namespace_type`` and ``namespace_inum`` for namespace operations;
> > + ``capability`` for capability use.
> >
> >
> > AUDIT_LANDLOCK_DOMAIN
> > diff --git a/Documentation/security/landlock.rst b/Documentation/security/landlock.rst
> > index 3e4d4d04cfae..cd3d640ca5c9 100644
> > --- a/Documentation/security/landlock.rst
> > +++ b/Documentation/security/landlock.rst
> > @@ -7,7 +7,7 @@ Landlock LSM: kernel documentation
> > ==================================
> >
> > :Author: Mickaël Salaün
> > -:Date: September 2025
> > +:Date: March 2026
> >
> > Landlock's goal is to create scoped access-control (i.e. sandboxing). To
> > harden a whole system, this feature should be available to any process,
> > @@ -89,6 +89,72 @@ this is required to keep access controls consistent over the whole system, and
> > this avoids unattended bypasses through file descriptor passing (i.e. confused
> > deputy attack).
> >
> > +Composability with user namespaces
> > +----------------------------------
> > +
> > +Landlock domain-based scoping and the kernel's user namespace-based capability
> > +scoping enforce isolation over independent hierarchies. Landlock checks domain
> > +ancestry; the kernel's ``ns_capable()`` checks user namespace ancestry. These
> > +hierarchies are orthogonal: Landlock enforcement is deterministic with respect
> > +to its own configuration, regardless of namespace or capability state, and vice
> > +versa. This orthogonality is a design invariant that must hold for all new
> > +scoped features.
> > +
> > +Ruleset restriction models
> > +--------------------------
>
> I have to second Justin, it's a good idea to introduce this explanation.
>
> > +
> > +Landlock provides three restriction models, each with different coverage
> > +and compatibility properties.
> > +
> > +Access rights (``handled_access_*``)
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Access rights control **enumerated operations on kernel objects**
> > +identified by a rule key (a file hierarchy or a network port). Each
> > +``handled_access_*`` field declares a set of access rights that the
> > +ruleset restricts. Multiple access rights share a single rule type.
> > +Operations for which no access right exists yet remain uncontrolled;
> > +new rights are added incrementally across ABI versions.
> > +
> > +Permissions (``handled_perm``)
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Permissions control **broad operations enforced at single kernel
> > +chokepoints**, achieving complete deny-by-default coverage. Each
> > +``LANDLOCK_PERM_*`` flag maps to its own rule type. When a ruleset
> > +handles a permission, all instances of that operation are denied unless
> > +explicitly allowed by a rule. New kernel values (new ``CAP_*``
> > +capabilities, new ``CLONE_NEW*`` namespace types) are automatically
> > +denied without any Landlock update.
>
> I find the terminology of "chokepoints" and "gateways" in this and the
> header documentation a bit vague; you could argue that opening a file
> for reading is also a chokepoint/gateway for using read() later on;
> it's not immediately clear to me how that's delineated.
Yeah, I wanted to express something wider than a fine-grained access
right. Any alternative words that would fit better?
>
> In my mind, the handled_* groups of access rights are usually defined
> by the "namespace" of the objects they are protecting, more than
> anything else: handled_access_fs: file paths, handled_access_net:
> struct sockaddr (which we only expose as "port" for now).
>
> To play the devil's advocate, a possible alternative would have been
> to introduce:
>
> handled_access_ns with values LANDLOCK_ACCESS_NS_FOO_ENTER,
> LANDLOCK_ACCESS_NS_BAR_ENTER, etc. (and documenting somewhere that
> these are guaranteed to stay in sync; a static assert is enough to
> make sure they do).
That was actually one of my initial versions, but I couldn't find any
other meaningful access rights that would both be useful for the
sandboxing use case and worth the implementation. In the end I
concluded that we needed "ambient" access rights for things that are not
really tied to existing kernel objects, and to be able to fully express
current and future properties, hence using non-Landlock UAPI
(capabilities, namespace types...). The handled_perm name was the least
ambiguous one I could find that still makes sense.
Another important property is that the permission rules don't have
access rights, only *one* permission bit, which could be removed. I
chose to keep it as a safeguard (for UAPI checks) and to still be able
to add new ones for such rules if one day we really find a useful use
case. Anyway, it's basically free.
>
> handled_access_caps with values LANDLOCK_ACCESS_CAPS_USE_FOO,
> LANDLOCK_ACCESS_CAPS_USE_BAR, etc., also guaranteed to stay in sync.
Genuine question: what would these FOO and BAR be? I couldn't find
anything worth it. The idea is to have a simple interface. In fact,
initially I didn't have these suffixes (i.e. _USE, _ENTER), and they are
not really needed, but these are also safeguards in case we would
need one, and the main motivation is to make the semantics clear to
users (and more consistent with other Landlock access rights).
>
> That way the blocked accesses would still be "operations", and we
> would not need to have rules for them because the "object" being
> protected are the processes within the Landlock domain, so to say.
I'm not sure I understand, but another previous version just put
the capability (and namespace type) bits directly in the ruleset
struct. The issue with this approach is that it doesn't work well with
deny-by-default enforcement, it would not be extensible, and it would
not handle compatibility well (fields set to zero by default).
>
> Arguably, the LANDLOCK_ACCESS_FS_MAKE_* rights already follow a
> similar pattern.
Hmm, I'm not following.
>
> To be clear, I am myself only 50% convinced whether the API would be
> better. The implementation would be easier (but that doesn't count
> much in comparison).
>
>
> > +Each permission flag names a single gateway operation whose control
> > +transitively covers an open-ended set of downstream operations: for
> > +example, exercising a capability enables privileged operations across
> > +many subsystems; entering a namespace enables gaining capabilities in a
> > +new context.
> > +
> > +Permission rules identify what to allow using constants defined by other
> > +kernel subsystems (``CAP_*``, ``CLONE_NEW*``). Unknown values are
> > +silently ignored because deny-by-default ensures they are denied anyway.
> > +In contrast, unknown ``LANDLOCK_PERM_*`` flags in ``handled_perm`` are
> > +rejected (``-EINVAL``), since Landlock owns that namespace.
>
> OK I played through the compatibility scenarios which puzzled me in my
> reply to the cover letter, for both namespaces and capabilities.
> Namespaces are OK, so I'm just including that for completeness and for
> comparison, but I think the capabilities might be tricky?
>
>
> Case A: Namespaces
>
> In the scenario where a caller restricts
> LANDLOCK_PERM_NAMESPACE_ENTER, but then adds a rule to allow a
> non-existent namespace number like 1<<63.
>
> Landlock ABI v9:
> * The rule is accepted and the unknown value for the namespace type
> silently ignored
> * It is not possible to enter the namespace because the namespace API
> doesn't exist for it. (But that's appropriate.)
Yes, the namespace would just be unknown to the kernel, Landlock doesn't
do anything here.
>
> Landlock ABI v_future (the namespace type 1<<63 exists now):
> * The rule continues to be accepted.
> * When trying to exercise the namespace type, it works.
It works because the kernel now knows about this namespace. Again,
nothing related to Landlock specifically.
>
> It seems that this scenario works fine. In the earlier version,
> entering the namespace already doesn't work because the kernel doesn't
> have support for it.
>
>
> Case B: Capabilities
>
> When new capabilities are introduced, I see that people have used the
> pattern where these capabilities are split off from operations which
> were previously controlled by CAP_SYS_ADMIN. An example is commit
> a17b53c4a4b5 ("bpf, capability: Introduce CAP_BPF"), which states:
>
> Split BPF operations that are allowed under CAP_SYS_ADMIN into
> combination of CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN. For backward
> compatibility include them in CAP_SYS_ADMIN as well.
>
> (The same pattern was also used in the introduction of
> CAP_CHECKPOINT_RESTORE and CAP_PERFMON. CAP_AUDIT_READ is older and
> did it differently.)
The key point here (and the architectural limitation) is that a new
capability cannot completely replace an existing one. The original
capability check will remain forever.
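To make the scenario below concrete, here is a minimal user-space model
of that pattern. It is only a sketch: the struct, the helper names and
the EX_* constants are hypothetical and exist only for illustration, it
is not kernel code. The numbers 21 and 39 are the values of
CAP_SYS_ADMIN and CAP_BPF, and 37 and 40 are used as "older" and "newer"
CAP_LAST_CAP values. The model assumes a task holding every capability
the kernel knows about (as after creating a user namespace) and a
Landlock domain that allows only the split-off capability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants, same values as CAP_SYS_ADMIN and CAP_BPF. */
#define EX_CAP_SYS_ADMIN 21
#define EX_CAP_BPF 39

/* Hypothetical model of a task, not a kernel structure. */
struct model_task {
	uint64_t effective; /* capabilities held by the credential */
	uint64_t allowed;   /* capabilities allowed by Landlock rules */
	bool handled;       /* domain handles LANDLOCK_PERM_CAPABILITY_USE */
};

/* All capabilities known to a kernel whose CAP_LAST_CAP is last_cap. */
static uint64_t model_full_caps(int last_cap)
{
	assert(last_cap >= 0 && last_cap < 63);
	return (UINT64_C(1) << (last_cap + 1)) - 1;
}

/* A capability unknown to the kernel is never checked, hence never usable. */
static bool model_capable(const struct model_task *t, int cap, int last_cap)
{
	uint64_t bit;

	if (cap < 0 || cap > last_cap)
		return false;
	bit = UINT64_C(1) << cap;
	if (!(t->effective & bit))
		return false;
	/* Landlock can only deny, never grant. */
	if (t->handled && !(t->allowed & bit))
		return false;
	return true;
}

/*
 * Operation first guarded by CAP_SYS_ADMIN alone.  A newer kernel also
 * accepts the split-off capability, but the CAP_SYS_ADMIN check stays.
 */
static bool model_split_capable(const struct model_task *t, int last_cap)
{
	return model_capable(t, EX_CAP_BPF, last_cap) ||
	       model_capable(t, EX_CAP_SYS_ADMIN, last_cap);
}
```

With only bit 39 allowed, the modelled operation is denied when
last_cap is 37 and allowed when last_cap is 40, although the Landlock
rule itself is unchanged. With only bit 21 allowed, it is allowed on
both.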
>
> Let's say there is a frobnicate() syscall guarded by CAP_SYS_ADMIN. A
> future kernel introduces CAP_FOO and then checks for frobnicate() that
> either one of CAP_FOO or CAP_SYS_ADMIN are present.
>
> A caller creates a ruleset restricting capability use with Landlock,
> and adds a rule to allow CAP_FOO but not CAP_SYS_ADMIN (e.g.,
> ^CAP_SYS_ADMIN)
>
> Landlock ABI v9: (CAP_FOO doesn't exist)
> * The rule for CAP_FOO is accepted and the unknown value for the
> capability silently ignored.
> * The call to frobnicate() fails because the use of the capability is
> forbidden
>
> Landlock ABI v10: (CAP_FOO starts to exist)
> * The rule continues to be accepted
> * The call to frobnicate() **succeeds now**, because the new kernel guards
> the operation by either one of those capabilities.
>
>
> So... for capabilities, it seems to be slightly incompatible if users
> allow capabilities with a rule which are not known yet? The reason
> for that is the way how capabilities "fork off" from CAP_SYS_ADMIN.
The key point is that compatibility is deferred to the other kernel
subsystems. User space needs to know which capabilities (or namespace
types) are supported before using them. It's not a Landlock
compatibility issue.
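As a rough sketch of what that could look like on the user space side,
a sandboxer can read /proc/sys/kernel/cap_last_cap and drop (or warn
about) the capability bits the running kernel does not know before
building its capability rule. The helper names below are hypothetical
and nothing here is part of the Landlock UAPI; only the procfs file is
an existing kernel interface.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Returns the highest capability number known to the running kernel,
 * or -1 if it cannot be determined.
 */
static int read_cap_last_cap(void)
{
	FILE *f = fopen("/proc/sys/kernel/cap_last_cap", "r");
	int last = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &last) != 1 || last < 0 || last > 63)
		last = -1;
	fclose(f);
	return last;
}

/* Bitmask of every capability from 0 to last_cap included. */
static uint64_t known_caps_mask(int last_cap)
{
	assert(last_cap >= 0 && last_cap <= 63);
	if (last_cap == 63)
		return UINT64_MAX;
	return (UINT64_C(1) << (last_cap + 1)) - 1;
}

/*
 * Returns the subset of requested that the kernel knows about and, if
 * unknown is not NULL, stores the remaining bits there so the caller
 * can report them instead of silently allowing a future capability.
 */
static uint64_t filter_known_caps(uint64_t requested, int last_cap,
				  uint64_t *unknown)
{
	const uint64_t known = known_caps_mask(last_cap);

	if (unknown)
		*unknown = requested & ~known;
	return requested & known;
}
```

The filtered mask can then be used as the capabilities field of the
rule, and a non-zero unknown value tells the caller that the policy
names a capability this kernel does not have. The Landlock rule
semantics stay the same on every kernel; the decision is entirely in
user space.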
>
> I mean, I can see that it's a pretty fringe scenario if users pass
> capabilities that don't exist yet, but it *is* strictly speaking an
> incompatibility. Should we check the range of the passed capabilities?
> Am I overlooking any downsides to this if we force users to stay
> between 0 and CAP_LAST_CAP?
Checking the range of known capabilities (or namespace types) could
break the same Landlock rules on different kernels even if targeting the
same Landlock ABI version, which would be much worse. I definitely
prefer to have idempotent/deterministic Landlock rules.
>
>
> > +
> > +Scopes (``scoped``)
> > +~~~~~~~~~~~~~~~~~~~~
> > +
> > +Scopes restrict **cross-domain interactions** categorically, without
> > +rules. Setting a scope flag (e.g. ``LANDLOCK_SCOPE_SIGNAL``) denies the
> > +operation to targets outside the Landlock domain or its children. Like
> > +permissions, scopes provide complete coverage of the controlled
> > +operation.
> > +
> > +When adding new Landlock features, new operations on existing rule types
> > +extend the corresponding ``handled_access_*`` field (e.g. a new
> > +filesystem operation extends ``handled_access_fs``). A new object
> > +category with multiple fine-grained operations would use a new
> > +``handled_access_*`` field. New rule types that control a single
> > +chokepoint operation use ``handled_perm``.
> > +
> > Tests
> > =====
> >
> > @@ -110,6 +176,18 @@ Filesystem
> > .. kernel-doc:: security/landlock/fs.h
> > :identifiers:
> >
> > +Namespace
> > +---------
> > +
> > +.. kernel-doc:: security/landlock/ns.h
> > + :identifiers:
> > +
> > +Capability
> > +----------
> > +
> > +.. kernel-doc:: security/landlock/cap.h
> > + :identifiers:
> > +
> > Process credential
> > ------------------
> >
> > diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
> > index 13134bccdd39..238d30a18162 100644
> > --- a/Documentation/userspace-api/landlock.rst
> > +++ b/Documentation/userspace-api/landlock.rst
> > @@ -8,7 +8,7 @@ Landlock: unprivileged access control
> > =====================================
> >
> > :Author: Mickaël Salaün
> > -:Date: January 2026
> > +:Date: March 2026
> >
> > The goal of Landlock is to enable restriction of ambient rights (e.g. global
> > filesystem or network access) for a set of processes. Because Landlock
> > @@ -33,7 +33,7 @@ A Landlock rule describes an action on an object which the process intends to
> > perform. A set of rules is aggregated in a ruleset, which can then restrict
> > the thread enforcing it, and its future children.
> >
> > -The two existing types of rules are:
> > +The existing types of rules are:
> >
> > Filesystem rules
> > For these rules, the object is a file hierarchy,
> > @@ -44,6 +44,14 @@ Network rules (since ABI v4)
> > For these rules, the object is a TCP port,
> > and the related actions are defined with `network access rights`.
> >
> > +Capability rules (since ABI v9)
> > + For these rules, the object is a set of Linux capabilities,
> > + and the related actions are defined with `permission flags`.
> > +
> > +Namespace rules (since ABI v9)
> > + For these rules, the object is a set of namespace types,
> > + and the related actions are defined with `permission flags`.
> > +
> > Defining and enforcing a security policy
> > ----------------------------------------
> >
> > @@ -84,6 +92,9 @@ to be explicit about the denied-by-default access rights.
> > .scoped =
> > LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
> > LANDLOCK_SCOPE_SIGNAL,
> > + .handled_perm =
> > + LANDLOCK_PERM_CAPABILITY_USE |
> > + LANDLOCK_PERM_NAMESPACE_ENTER,
> > };
> >
> > Because we may not know which kernel version an application will be executed
> > @@ -127,6 +138,12 @@ version, and only use the available subset of access rights:
> > /* Removes LANDLOCK_SCOPE_* for ABI < 6 */
> > ruleset_attr.scoped &= ~(LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
> > LANDLOCK_SCOPE_SIGNAL);
> > + __attribute__((fallthrough));
> > + case 6:
> > + case 7:
> > + case 8:
> > + /* Removes permission support for ABI < 9 */
> > + ruleset_attr.handled_perm = 0;
> > }
> >
> > This enables the creation of an inclusive ruleset that will contain our rules.
> > @@ -191,6 +208,42 @@ number for a specific action: HTTPS connections.
> > err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
> > &net_port, 0);
> >
> > +For capability access-control, we can add rules that allow specific
> > +capabilities. For instance, to allow ``CAP_SYS_CHROOT`` (so the sandboxed
> > +process can call :manpage:`chroot(2)` inside a user namespace):
> > +
> > +.. code-block:: c
> > +
> > + struct landlock_capability_attr cap_attr = {
> > + .allowed_perm = LANDLOCK_PERM_CAPABILITY_USE,
> > + .capabilities = (1ULL << CAP_SYS_CHROOT),
> > + };
> > +
> > + err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_CAPABILITY,
> > + &cap_attr, 0);
> > +
> > +For namespace access-control, we can add rules that allow entering specific
> > +namespace types (creating them via :manpage:`unshare(2)` / :manpage:`clone(2)`
> > +or joining them via :manpage:`setns(2)`). For instance, to allow creating user
> > +namespaces (which grants all capabilities inside the new namespace):
> > +
> > +.. code-block:: c
> > +
> > + struct landlock_namespace_attr ns_attr = {
> > + .allowed_perm = LANDLOCK_PERM_NAMESPACE_ENTER,
> > + .namespace_types = CLONE_NEWUSER,
> > + };
> > +
> > + err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NAMESPACE,
> > + &ns_attr, 0);
> > +
> > +Together, these two rules allow an unprivileged process to create a user
> > +namespace and call :manpage:`chroot(2)` inside it, while denying all other
> > +capabilities and namespace types. User namespace creation is the one operation
> > +that does not require ``CAP_SYS_ADMIN``, so no capability rule is needed for it.
> > +See `Capability and namespace restrictions`_ for details on capability
> > +requirements.
> > +
> > When passing a non-zero ``flags`` argument to ``landlock_restrict_self()``, a
> > similar backwards compatibility check is needed for the restrict flags
> > (see sys_landlock_restrict_self() documentation for available flags):
> > @@ -354,10 +407,87 @@ The operations which can be scoped are:
> > A :manpage:`sendto(2)` on a socket which was previously connected will not
> > be restricted. This works for both datagram and stream sockets.
> >
> > -IPC scoping does not support exceptions via :manpage:`landlock_add_rule(2)`.
> > +Scoping does not support exceptions via :manpage:`landlock_add_rule(2)`.
> > If an operation is scoped within a domain, no rules can be added to allow access
> > to resources or processes outside of the scope.
> >
> > +Capability and namespace restrictions
> > +-------------------------------------
> > +
> > +See Documentation/security/landlock.rst for the design rationale behind
> > +the permission model (``handled_perm``) and how it differs from access
> > +rights (``handled_access_*``) and scopes (``scoped``).
> > +When a process creates a user namespace, the kernel grants all capabilities
> > +within that namespace. While these capabilities cannot directly bypass Landlock
> > +restrictions (Landlock enforces access controls independently of capability
> > +checks), they open kernel code paths that are normally unreachable to
> > +unprivileged users and may contain exploitable bugs.
> > +
> > +Landlock provides two complementary permissions to address this.
> > +``LANDLOCK_PERM_CAPABILITY_USE`` restricts which capabilities a process can use,
> > +even when it holds them. ``LANDLOCK_PERM_NAMESPACE_ENTER`` restricts which
> > +namespace types a process can create (via :manpage:`unshare(2)` or
> > +:manpage:`clone(2)`) or join (via :manpage:`setns(2)`). After creating a user
> > +namespace, the granted capabilities are scoped to namespaces owned by that user
> > +namespace or its descendants; to exercise a capability such as
> > +``CAP_NET_ADMIN``, the process must create a namespace of the corresponding type
> > +(e.g., a network namespace). Configuring both permissions together provides
> > +full coverage: ``LANDLOCK_PERM_CAPABILITY_USE`` restricts which capabilities are
> > +available, while ``LANDLOCK_PERM_NAMESPACE_ENTER`` restricts the namespaces in
> > +which they can be used.
> > +
> > +When a Landlock domain handles ``LANDLOCK_PERM_CAPABILITY_USE``, all Linux
> > +:manpage:`capabilities(7)` are denied by default unless a rule explicitly allows
> > +them. This is purely restrictive: Landlock can only deny capabilities that the
> > +traditional capability mechanism would have allowed, never grant additional ones.
> > +Rules are added with ``LANDLOCK_RULE_CAPABILITY`` using a
> > +&struct landlock_capability_attr. Each rule specifies a set of ``CAP_*`` values
> > +(as a bitmask) to allow. Capabilities above ``CAP_LAST_CAP`` are silently
> > +accepted but have no effect since the kernel never checks them; this means new
> > +capabilities introduced by future kernels are automatically denied.
>
> (See example above.)
>
>
> > +
> > +When a Landlock domain handles ``LANDLOCK_PERM_NAMESPACE_ENTER``, namespace
> > +creation and entry are denied by default unless a rule explicitly allows them.
> > +Rules are added with ``LANDLOCK_RULE_NAMESPACE`` using a
> > +&struct landlock_namespace_attr. Each rule specifies a set of ``CLONE_NEW*``
> > +flags to allow.
> > +
> > +In practice, unprivileged processes first create a user namespace (which requires
> > +no capability and grants all capabilities within it), then use those capabilities
> > +to create other namespace types. All non-user namespace types require
> > +``CAP_SYS_ADMIN`` for both creation and :manpage:`setns(2)` entry; mount
> > +namespace entry additionally requires ``CAP_SYS_CHROOT``. For
> > +:manpage:`setns(2)`, capabilities are checked relative to the target namespace,
> > +so a process in an ancestor user namespace naturally satisfies them; this
> > +includes joining user namespaces, which requires ``CAP_SYS_ADMIN``. When
> > +``LANDLOCK_PERM_CAPABILITY_USE`` is also handled, each of these capabilities
> > +must be explicitly allowed by a rule.
> > +
> > +When combining ``CLONE_NEWUSER`` with other ``CLONE_NEW*`` flags in a single
> > +:manpage:`unshare(2)` call, the ``CAP_SYS_ADMIN`` check targets the newly
> > +created user namespace, which is handled by ``LANDLOCK_PERM_NAMESPACE_ENTER``
> > +independently from ``LANDLOCK_PERM_CAPABILITY_USE``. Performing the user
> > +namespace creation and the additional namespace creation in two separate
> > +:manpage:`unshare(2)` calls requires a rule allowing ``CAP_SYS_ADMIN`` if the
> > +domain also handles ``LANDLOCK_PERM_CAPABILITY_USE``.
> > +
> > +More generally, Landlock domains and user namespaces form independent
> > +hierarchies: Landlock domains restrict what actions are allowed (each stacked
> > +layer narrows the permitted set), while user namespaces restrict where
> > +capabilities take effect (only within the process's own namespace and its
> > +descendants). Landlock access controls are fully determined by the domain
> > +configuration, regardless of the process's position in the user namespace
> > +hierarchy. When creating child user namespaces, it is recommended to also
> > +create a dedicated Landlock domain with restrictions relevant to each namespace
> > +context.
> > +
> > +Note that ``LANDLOCK_PERM_CAPABILITY_USE`` restricts the *use* of capabilities,
> > +not their presence in the process's credential. Capability sets can change
> > +after a domain is enforced through user namespace entry, :manpage:`execve(2)` of
> > +binaries with file capabilities, or :manpage:`capset(2)`. In all cases,
> > +:manpage:`capget(2)` will report the credential's capability sets, but any
> > +denied capability will fail with ``EPERM`` when exercised.
> > +
> > Truncating files
> > ----------------
> >
> > @@ -515,7 +645,7 @@ Access rights
> > -------------
> >
> > .. kernel-doc:: include/uapi/linux/landlock.h
> > - :identifiers: fs_access net_access scope
> > + :identifiers: fs_access net_access scope perm
> >
> > Creating a new ruleset
> > ----------------------
> > @@ -534,7 +664,8 @@ Extending a ruleset
> >
> > .. kernel-doc:: include/uapi/linux/landlock.h
> > :identifiers: landlock_rule_type landlock_path_beneath_attr
> > - landlock_net_port_attr
> > + landlock_net_port_attr landlock_capability_attr
> > + landlock_namespace_attr
> >
> > Enforcing a ruleset
> > -------------------
> > @@ -685,6 +816,21 @@ enforce Landlock rulesets across all threads of the calling process
> > using the ``LANDLOCK_RESTRICT_SELF_TSYNC`` flag passed to
> > sys_landlock_restrict_self().
> >
> > +Capability restriction (ABI < 9)
> > +--------------------------------
> > +
> > +Starting with the Landlock ABI version 9, it is possible to restrict
> > +:manpage:`capabilities(7)` with the new ``LANDLOCK_PERM_CAPABILITY_USE``
> > +permission flag and ``LANDLOCK_RULE_CAPABILITY`` rule type.
> > +
> > +Namespace restriction (ABI < 9)
> > +-------------------------------
> > +
> > +Starting with the Landlock ABI version 9, it is possible to restrict
> > +namespace creation (:manpage:`unshare(2)`, :manpage:`clone(2)`) and entry
> > +(:manpage:`setns(2)`) with the new ``LANDLOCK_PERM_NAMESPACE_ENTER`` permission
> > +flag and ``LANDLOCK_RULE_NAMESPACE`` rule type.
> > +
> > .. _kernel_support:
> >
> > Kernel support
> > --
> > 2.53.0
> >
>
^ permalink raw reply
* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Yeoreum Yun @ 2026-04-23 13:55 UTC (permalink / raw)
To: Mimi Zohar
Cc: Jonathan McDowell, linux-security-module, linux-kernel,
linux-integrity, linux-arm-kernel, kvmarm, paul, jmorris, serge,
roberto.sassu, dmitry.kasatkin, eric.snowberg, jarkko, jgg,
sudeep.holla, maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, noodles, sebastianene
In-Reply-To: <bd908e28298d968740d03c97bc7e441de188b7b4.camel@linux.ibm.com>
> On Thu, 2026-04-23 at 13:53 +0100, Jonathan McDowell wrote:
> > On Thu, Apr 23, 2026 at 01:34:13PM +0100, Yeoreum Yun wrote:
> > > > > On Thu, 2026-04-23 at 06:55 +0100, Yeoreum Yun wrote:
> > > > > > > On Wed, 2026-04-22 at 20:41 +0100, Yeoreum Yun wrote:
> > > > > > > > > Hi Mimi,
> > > > > > > > >
> > > > > > > > > > On Wed, 2026-04-22 at 17:24 +0100, Yeoreum Yun wrote:
> > > > > > > > > > > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
> > > > > > > > > > > the TPM driver must be built as built-in and
> > > > > > > > > > > must be probed before the IMA subsystem is initialized.
> > > > > > > > > > >
> > > > > > > > > > > However, when the TPM device operates over the FF-A protocol using
> > > > > > > > > > > the CRB interface, probing fails and returns -EPROBE_DEFER if
> > > > > > > > > > > the tpm_crb_ffa device — an FF-A device that provides the communication
> > > > > > > > > > > interface to the tpm_crb driver — has not yet been probed.
> > > > > > > > > > >
> > > > > > > > > > > To ensure the TPM device operating over the FF-A protocol with
> > > > > > > > > > > the CRB interface is probed before IMA initialization,
> > > > > > > > > > > the following conditions must be met:
> > > > > > > > > > >
> > > > > > > > > > > 1. The corresponding ffa_device must be registered,
> > > > > > > > > > > which is done via ffa_init().
> > > > > > > > > > >
> > > > > > > > > > > 2. The tpm_crb_driver must successfully probe this device via
> > > > > > > > > > > tpm_crb_ffa_init().
> > > > > > > > > > >
> > > > > > > > > > > 3. The tpm_crb driver using CRB over FF-A can then
> > > > > > > > > > > be probed successfully. (See crb_acpi_add() and
> > > > > > > > > > > tpm_crb_ffa_init() for reference.)
> > > > > > > > > > >
> > > > > > > > > > > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
> > > > > > > > > > > all registered with device_initcall, which means crb_acpi_driver_init() may
> > > > > > > > > > > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
> > > > > > > > > > >
> > > > > > > > > > > When this occurs, probing the TPM device is deferred.
> > > > > > > > > > > However, the deferred probe can happen after the IMA subsystem
> > > > > > > > > > > has already been initialized, since IMA initialization is performed
> > > > > > > > > > > during late_initcall, and deferred_probe_initcall() is performed
> > > > > > > > > > > at the same level.
> > > > > > > > > > >
> > > > > > > > > > > To resolve this, call ima_init() again at the late_initcall_sync level
> > > > > > > > > > > so that IMA does not miss the TPM PCR values when generating the
> > > > > > > > > > > boot_aggregate log even though a TPM device is present in the system.
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > > > > > > > > >
> > > > > > > > > > A lot of change for just detecting whether ima_init() is being called on
> > > > > > > > > > late_initcall or late_initcall_sync(), without any explanation for all the other
> > > > > > > > > > changes (e.g. ima_init_core).
> > > > > > > > > >
> > > > > > > > > > Please just limit the change to just calling ima_init() twice.
> > > > > > > > >
> > > > > > > > > My concern is that ima_update_policy_flags() will be called
> > > > > > > > > when ima_init() is deferred, i.e. when nothing has been initialised yet.
> > > > > > > > > Functionally that might be okay, but logically
> > > > > > > > > I think ima_update_policy_flags() and the notifier should only run after
> > > > > > > > > ima_init() has completed.
> > > > > > > > >
> > > > > > > > > I don't think this change is that large: it just wraps ima_init() in
> > > > > > > > > ima_init_core() with some error handling.
> > > > > > > > >
> > > > > > > > > Am I missing something?
> > > > > > > >
> > > > > > > > Also, if we handle this only inside ima_init() and it fails for another
> > > > > > > > reason, we shouldn't call ima_init() again at late_initcall_sync.
> > > > > > > >
> > > > > > > > This can't be handled inside ima_init(); it has to be handled
> > > > > > > > by the caller of ima_init().
> > > > > > >
> > > > > > > Only tpm_default_chip() is being called to set the ima_tpm_chip. On failure,
> > > > > > > instead of going into TPM-bypass mode, return immediately. There are no calls
> > > > > > > to anything else. Just call ima_init() a second time.
> > > > > >
> > > > > > I’m not fully convinced this is sufficient.
> > > > > >
> > > > > > What I meant is the case where ima_init() fails due to other
> > > > > > initialisation steps, not only tpm_default_chip() (e.g. ima_fs_init()).
> > > > >
> > > > > The purpose of THIS patch is to add late_initcall_sync, when the TPM is not
> > > > > available at late_initcall. This would be classified as a bug fix and would be
> > > > > backported. No other changes should be included in this patch.
> > > >
> > > > Okay.
> > > >
> > > > > >
> > > > > > I’d also like to ask again whether it is fine to call
> > > > > > ima_update_policy_flags() and keep the notifier registered in the
> > > > > > deferred TPM case. While this may be functionally acceptable, it seems
> > > > > > logically questionable to do so when ima_init() has not completed.
> > > > >
> > > > > Other than extending the TPM, IMA should behave exactly the same whether there
> > > > > is a TPM or goes into TPM-bypass mode.
> > > > >
> > > > > >
> > > > > > There is also a possibility that a deferred case ultimately fails (e.g.
> > > > > > deferred at late_initcall, but then failing at late_initcall_sync
> > > > > > for another reason, even while entering TPM bypass mode). In that case,
> > > > > > it seems more appropriate to handle this state in the caller of
> > > > > > ima_init(), rather than inside ima_init() itself.
> > > > >
> > > > > If the TPM isn't found at late_initcall_sync(), then IMA should go into TPM-
> > > > > bypass mode. Please don't make any other changes to the existing IMA behavior
> > > > > and hide it here behind the late_initcall_sync change.
> > > >
> > > > Okay. You're saying that calling ima_update_policy_flags() at late_initcall
> > > > isn't a problem even if ima_init() at late_initcall_sync
> > > > ends up in "TPM-bypass mode".
> > > >
> > > > I see, I'll make the patch simpler then.
> > >
> > > But consider the following situation:
> > > - The first ima_init() at late_initcall is deferred.
> > > - late_initcall_sync tries again but fails, then tries again with
> > > CONFIG_IMA_DEFAULT_HASH.
> > >
> > > I would like to keep init_ima_core to avoid repeating the same code
> > > in late_initcall_sync.
> >
> > I think what Mimi's proposing is:
> >
> > If we're in late_initcall, and the TPM isn't available, return
> > immediately with an error (the EPROBE_DEFER?), don't do any init.
> >
> > If we're in late_initcall_sync, either we're already initialised, so
> > return and do nothing, or run through the entire flow, even if the TPM
> > isn't available.
> >
> > So ima_init() just needs to know a) if it's in the sync or non-sync mode
> > and b) for the sync mode, if we've already done the init at
> > non-sync.
>
> Thanks, Jonathan. That is exactly what I'm suggesting. Any other changes
> should not be included in this patch. Since Yeoreum is not hearing me, feel
> free to post a patch.
I see. So this is all that is needed.
If it looks good to you, I'll send it as v3.
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index d48bf0ad26f4..88fe105b7f00 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -166,6 +166,7 @@ enum lsm_order {
* @initcall_fs: LSM callback for fs_initcall setup, optional
* @initcall_device: LSM callback for device_initcall() setup, optional
* @initcall_late: LSM callback for late_initcall() setup, optional
+ * @initcall_late_sync: LSM callback for late_initcall_sync() setup, optional
*/
struct lsm_info {
const struct lsm_id *id;
@@ -181,6 +182,7 @@ struct lsm_info {
int (*initcall_fs)(void);
int (*initcall_device)(void);
int (*initcall_late)(void);
+ int (*initcall_late_sync)(void);
};
#define DEFINE_LSM(lsm) \
diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
index a2f34f2d8ad7..334fa8927c45 100644
--- a/security/integrity/ima/ima_init.c
+++ b/security/integrity/ima/ima_init.c
@@ -118,10 +118,22 @@ void __init ima_load_x509(void)
int __init ima_init(void)
{
int rc;
+ static bool deferred = false;
+ static bool initialised = false;
+
+ if (initialised)
+ return 0;
ima_tpm_chip = tpm_default_chip();
- if (!ima_tpm_chip)
+ if (!ima_tpm_chip) {
+ if (!deferred) {
+ pr_info("Defer initialisation to the late_initcall_sync stage.\n");
+ deferred = true;
+ return 0;
+ }
+
pr_info("No TPM chip found, activating TPM-bypass!\n");
+ }
rc = integrity_init_keyring(INTEGRITY_KEYRING_IMA);
if (rc)
@@ -158,5 +170,7 @@ int __init ima_init(void)
UTS_RELEASE, strlen(UTS_RELEASE), false,
NULL, 0);
+ initialised = true;
+
return rc;
}
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 1d6229b156fb..847ec74a183d 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -1274,6 +1274,11 @@ static int __init init_ima(void)
return error;
}
+static int __init late_init_ima(void)
+{
+ return ima_init();
+}
+
static struct security_hook_list ima_hooks[] __ro_after_init = {
LSM_HOOK_INIT(bprm_check_security, ima_bprm_check),
LSM_HOOK_INIT(bprm_creds_for_exec, ima_bprm_creds_for_exec),
@@ -1321,4 +1326,6 @@ DEFINE_LSM(ima) = {
.blobs = &ima_blob_sizes,
/* Start IMA after the TPM is available */
.initcall_late = init_ima,
+ /* Start IMA late in case of probing TPM is deferred. */
+ .initcall_late_sync = late_init_ima,
};
diff --git a/security/lsm_init.c b/security/lsm_init.c
index 573e2a7250c4..4e5c59beb82a 100644
--- a/security/lsm_init.c
+++ b/security/lsm_init.c
@@ -547,13 +547,22 @@ device_initcall(security_initcall_device);
* security_initcall_late - Run the LSM late initcalls
*/
static int __init security_initcall_late(void)
+{
+ return lsm_initcall(late);
+}
+late_initcall(security_initcall_late);
+
+/**
+ * security_initcall_late_sync - Run the LSM late initcalls sync
+ */
+static int __init security_initcall_late_sync(void)
{
int rc;
- rc = lsm_initcall(late);
+ rc = lsm_initcall(late_sync);
lsm_pr_dbg("all enabled LSMs fully activated\n");
call_blocking_lsm_notifier(LSM_STARTED_ALL, NULL);
return rc;
}
-late_initcall(security_initcall_late);
+late_initcall_sync(security_initcall_late_sync);
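For readers following the control flow, the two static flags added to
ima_init() in the diff above can be modelled in userspace. This is only a
sketch under stated assumptions: struct demo_ima_state, demo_ima_init() and
the tpm_present parameter are hypothetical names (tpm_present stands in for
tpm_default_chip() returning a chip), and the error paths of the real
function are not modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two static bools added by the diff. */
struct demo_ima_state {
	bool deferred;    /* the first attempt found no TPM and backed off */
	bool initialised; /* the full initialisation has already run */
	bool tpm_bypass;  /* initialisation ran without a TPM */
};

/*
 * Models one call to ima_init(). It is meant to be called once for the
 * late_initcall stage and once for the late_initcall_sync stage.
 * Always returns 0 because failure paths are not modelled here.
 */
static int demo_ima_init(struct demo_ima_state *st, bool tpm_present)
{
	if (st->initialised)
		return 0; /* second stage: nothing left to do */

	if (!tpm_present) {
		if (!st->deferred) {
			/* first miss: wait for the late_initcall_sync stage */
			st->deferred = true;
			return 0;
		}
		/* second miss: carry on without a TPM */
		st->tpm_bypass = true;
	}

	/* the keyring, crypto and fs setup would run here in the kernel */
	st->initialised = true;
	return 0;
}
```

Calling demo_ima_init() twice covers the three cases discussed in this
thread: TPM present at the first stage, TPM present only at the second
stage, and no TPM at either stage (TPM-bypass).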
--
Sincerely,
Yeoreum Yun
^ permalink raw reply related
* Re: [RFC PATCH v1 00/11] Landlock: Namespace and capability control
From: Mickaël Salaün @ 2026-04-23 13:50 UTC (permalink / raw)
To: Günther Noack
Cc: Christian Brauner, Günther Noack, Paul Moore,
Serge E . Hallyn, Justin Suess, Lennart Poettering,
Mikhail Ivanov, Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang,
kernel-team, linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260422.c1e2cbee5589@gnoack.org>
On Wed, Apr 22, 2026 at 11:16:59PM +0200, Günther Noack wrote:
> On Tue, Apr 21, 2026 at 10:24:00AM +0200, Mickaël Salaün wrote:
> > On Mon, Apr 20, 2026 at 05:06:32PM +0200, Günther Noack wrote:
> > > Hello!
> > >
> > > On Thu, Mar 12, 2026 at 11:04:33AM +0100, Mickaël Salaün wrote:
> > > > Namespaces are a fundamental building block for containers and
> > > > application sandboxes, but user namespace creation significantly widens
> > > > the kernel attack surface. CVE-2022-0185 (filesystem mount parsing),
> > > > CVE-2022-25636 and CVE-2023-32233 (netfilter), and CVE-2022-0492 (cgroup
> > > > v1 release_agent) all demonstrate vulnerabilities exploitable only
> > > > through capabilities gained via user namespaces. Some distributions
> > > > block user namespace creation entirely, but this removes a useful
> > > > isolation primitive. Fine-grained control allows trusted programs to
> > > > use namespaces while preventing unnecessary exposure for programs that
> > > > do not need them.
> > > >
> > > > Existing mechanisms (user.max_*_namespaces sysctls, userns_create LSM
> > > > hook, PR_SET_NO_NEW_PRIVS, and capset) each address part of this threat
> > > > but none provides per-process, fine-grained control over both namespace
> > > > types and capabilities. Container runtimes resort to seccomp-based
> > > > clone/unshare filtering, but seccomp cannot dereference clone3's flag
> > > > structure, forcing runtimes to block clone3 entirely.
> > > >
> > > > Landlock's composable layer model enables several patterns: a user
> > > > session manager can restrict namespace types and capabilities broadly
> > > > while allowing trusted programs to create the namespaces they need, and
> > > > each deeper layer can further restrict the allowed set. Container
> > > > runtimes can similarly deny namespace creation inside managed
> > > > containers.
> > >
> > > I assume we are talking about an unrestricted systemd user session
> > > manager, which would not itself be restricted? (If the entire user
> > > session were running under Landlock, users couldn't change their
> > > passwords with "passwd" any more, because of the no_new_privs
> > > requirement.)
> >
> > systemd can be used to create such a session, as can other init systems.
> > If no_new_privs is set, commands such as passwd would indeed not work,
> > but:
> > 1. The process applying the Landlock restrictions (e.g. creating the
> > user session) doesn't need to set no_new_privs if it has
> > CAP_SYS_ADMIN in the current user namespace.
> > 2. SUID programs can (and should probably) be replaced with proper
> > client/server interfaces (i.e. for the client to not be privileged),
> > see DBus services (e.g. Account) or homectl for instance.
>
> I also think services are a better approach than the suid bit, but
> that's to my knowledge not the state of affairs yet (until Lennart
> makes it happen, hint hint ;-)).
>
>
> > > > This series adds two new permission categories to Landlock:
> > > >
> > > > - LANDLOCK_PERM_NAMESPACE_ENTER: Restricts which namespace types a
> > > > sandboxed process can acquire: both creation (unshare/clone) and entry
> > > > (setns). User namespace creation has no capability check in the
> > > > kernel, so this is the only enforcement mechanism for that entry
> > > > point.
> > > >
> > > > - LANDLOCK_PERM_CAPABILITY_USE: Restricts which Linux capabilities a
> > > > sandboxed process can use, regardless of how they were obtained
> > > > (including through user namespace creation).
> > >
> > > Given that you already went through multiple iterations here, I fully
> >
> > It's the first public one, but it's well advanced.
> >
> > > expect that I am overlooking something here, but based on the
> > > explanation, it's not clear to me why the capability control is needed
> > > in addition to the namespace control, to reduce the kernel attack
> > > surface.
> > >
> > > In my understanding the "attack surface" problem with user namespaces
> > > is that they allow unprivileged processes to gain CAP_SYS_ADMIN within
> > > that namespace, which unlocks access to code paths which were
> > > traditionally reserved for the (top level) root user.
> >
> > This capability and others.
> >
> > >
> > > But then, to prevent that from happening, it seems that restricting
> > > access to user namespace creation would be sufficient?
> >
> > It would be sufficient to limit the kernel attack surface, but it would
> > make all the related features unusable. As explained in this cover
> > letter, there are already several ways to block everything, but this
> > doesn't help for a lot of use cases and this Landlock feature proposes a
> > new fine-grained and unprivileged way to properly restrict some
> > capabilities.
> >
> > >
> > > (Also, in some cases, I suspect it might be possible to break
> > > assumptions that more privileged processes make about filesystem
> > > layout if the user can change the mount layout. But that is not an
> > > issue with Landlock, as we forbid changes to mounts and also require
> > > no_new_privs.)
> > >
> > >
> > > > Both use new handled_perm and LANDLOCK_RULE_* constants following the
> > > > existing allow-list model. The UAPI uses raw CAP_* and CLONE_NEW*
> > > > values directly; unknown values are silently accepted for forward
> > > > compatibility (the allow-list denies them by default). The Landlock ABI
> > > > version is bumped from 8 to 9.
> > >
> > > Compatibility question:
> > >
> > > For both permission categories, when they are "handled" in the
> > > ruleset, they default to denying *all* types of namespaces, and *all*
> > > types of capabilities.
> > >
> > > This is different to the handled_access_* rights, where we are
> > > requiring users to explicitly list all restricted rights as "handled",
> > > because the full list of available operations might be a moving
> > > target.
> > >
> > > Why is this not a problem for capabilities and for namespaces? Both
> > > the list of capabilities and the list of namespaces has been expanded
> > > in the past. What happens if a new capability or namespace is
> > > invented? If these are evolved, is that backwards compatible for the
> > > existing users of these Landlock permission categories?
> >
> > This question is answered in the documentation (and the commit
> > messages), and that's the main difference between handled_access_* and
> > handled_perm. In a nutshell, the permission rules use non-Landlock
> > bits that naturally evolve without any Landlock-specific changes.
>
> I think the deny-by-default is fine given that these namespaces and
> capabilities do not exist yet. It is the case where users add a rule
> and we silently ignore unknown bits in the bitfield, which I think
> introduces a small problem. I responded to the documentation commit
> with what I believe is a counterexample for the capabilities case.
> (Let's discuss it on the documentation patch in the context of the
> examples.)
>
>
> > > > The handled_perm infrastructure is designed to be reusable by future
> > > > permission categories. The last patch documents the design rationale
> > > > for the permission model and the criteria for choosing between
> > > > handled_access_*, handled_perm, and scoped. A patch series to add
> > > > socket creation control is under review [2]; it could benefit from the
> > > > same permission model to achieve complete deny-by-default coverage of
> > > > socket creation.
> >
> > See here ^
> >
> > > >
> > > > This series builds on Christian Brauner's namespace LSM blob RFC [1],
> > > > included as patch 1.
> > > >
> > > > Christian, could you please review patch 3? It adds a FOR_EACH_NS_TYPE
> > > > X-macro to ns_common_types.h and derives CLONE_NS_ALL, replacing inline
> > > > CLONE_NEW* flag enumerations in nsproxy.c and fork.c.
> > > >
> > > > Paul, could you please review patch 2? It adds LSM_AUDIT_DATA_NS, a new
> > > > audit record type that logs namespace_type and inum for
> > > > namespace-related LSM denials.
> > > >
> > > > All four example vulnerabilities follow the same pattern: an
> > > > unprivileged user creates a user namespace to obtain capabilities, then
> > > > creates a second namespace to exercise them against vulnerable code.
> > > > LANDLOCK_PERM_NAMESPACE_ENTER prevents this by denying the user
> > > > namespace (eliminating the capability grant) or the specific namespace
> > > > type needed to exercise it. LANDLOCK_PERM_CAPABILITY_USE independently
> > > > prevents it by denying the required capability.
> > >
> > > Here, it is also not clear to me why LANDLOCK_PERM_CAPABILITY_USE is
> > > needed in addition to LANDLOCK_PERM_NAMESPACE_ENTER.
> >
> > This is also explained in the documentation.
>
> > > Looking at capabilities(7), my understanding is that capabilities can
> > > only be acquired through:
> > >
> > > (1) user namespaces (prevented with LANDLOCK_PERM_NAMESPACE_ENTER)
> > > (2) execve (setuid or individual capabilities, prevented using
> > > PR_SET_NO_NEW_PRIVS)
> > >
> > > ...so if a process were to start out with no such capabilities,
> > > wouldn't that be enough to prevent it from gaining more? Am I
> > > overlooking another way through which these can be acquired?
> > >
> > > The Landlock capability support adds a "filter" for the use of
> > > capabilities, but my understanding of the capability system was that
> > > it already *is* that filter. As long as we prevent the acquisition of
> > > new capabilities, shouldn't that be sufficient?
> >
> > In a nutshell, capabilities apply to namespaces (and their type), so
> > it makes sense to be able to control them together, see the chroot
> > example. Please take a look at the documentation.
>
> I had a hard time puzzling it together in the documentation, but the
> chroot example helped.
>
> So, if I am understanding correctly, the idea is that you need it in
> order to create a new user namespace,
The user namespace is the only namespace that doesn't require a
capability, but all others need at least one (which can be gained with a
user namespace).
> but then restrict the use of
> capabilities within that user namespace (not only CAP_SYS_ADMIN, but
> also more individual ones). Sounds reasonable.
That enables us to restrict the use of capabilities (within a user
namespace or not), and then, because capabilities apply to a namespace
hierarchy, to restrict some operations too. The limitation lies in the
split of capabilities (e.g. CAP_SYS_ADMIN is needed to create most
namespaces), but we cannot do anything about that (because of
compatibility). We can only hope that new capabilities will be
introduced to improve the situation.
>
> I can also see that in order to do that without the Landlock
> capability support, the first process within the new namespace would
> immediately need to drop capabilities, and that may be outside of the
> control of the person defining the Landlock policy..?
Right, dropping capabilities doesn't make sense for a (sandboxed)
attacker able to create a user namespace.
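
To illustrate the allow-list behaviour discussed above, here is a rough
userspace model, not the Landlock implementation. It assumes that each
layer handling the capability permission carries a bitmask of allowed
capabilities, and that a capability may be used only if every such layer
allows it. struct demo_layer, demo_cap_use() and the DEMO_CAP_* names are
hypothetical; the numeric values mirror CAP_NET_ADMIN, CAP_SYS_CHROOT and
CAP_SYS_ADMIN from the UAPI capability header. The kernel's own capability
check against the credential is outside this sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; values mirror the UAPI CAP_* numbers. */
#define DEMO_CAP_NET_ADMIN  12
#define DEMO_CAP_SYS_CHROOT 18
#define DEMO_CAP_SYS_ADMIN  21

/* One sandbox layer in this model. */
struct demo_layer {
	bool handles_cap_use;  /* layer restricts capability use at all */
	uint64_t allowed_caps; /* bit n set: a rule allows capability n */
};

/*
 * Returns 0 if every layer that handles capability use allows cap,
 * -EPERM otherwise. Anything not explicitly allowed is denied, which
 * includes capability numbers the layer has never heard of.
 */
static int demo_cap_use(const struct demo_layer *layers, size_t n,
			unsigned int cap)
{
	for (size_t i = 0; i < n; i++) {
		if (!layers[i].handles_cap_use)
			continue; /* this layer does not restrict capabilities */
		if (cap >= 64 ||
		    !(layers[i].allowed_caps & (UINT64_C(1) << cap)))
			return -EPERM;
	}
	return 0;
}
```

With an outer layer allowing CAP_SYS_ADMIN and CAP_SYS_CHROOT and an inner
layer allowing only CAP_SYS_CHROOT, the model denies CAP_SYS_ADMIN once both
layers apply, which matches the "each deeper layer can further restrict the
allowed set" description in the cover letter.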
^ permalink raw reply
* Re: [RFC PATCH v1 11/11] landlock: Add documentation for capability and namespace restrictions
From: Mickaël Salaün @ 2026-04-23 13:51 UTC (permalink / raw)
To: Justin Suess
Cc: Christian Brauner, Günther Noack, Paul Moore,
Serge E . Hallyn, Lennart Poettering, Mikhail Ivanov,
Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang, kernel-team,
linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <abLSSuDUs22U1yzm@suesslenovo>
On Thu, Mar 12, 2026 at 10:48:42AM -0400, Justin Suess wrote:
> On Thu, Mar 12, 2026 at 11:04:44AM +0100, Mickaël Salaün wrote:
> > Document the two new Landlock permission categories in the userspace
> > API guide, admin guide, and kernel security documentation.
> >
> > The userspace API guide adds sections on capability restriction
> > (LANDLOCK_PERM_CAPABILITY_USE with LANDLOCK_RULE_CAPABILITY), namespace
> > restriction (LANDLOCK_PERM_NAMESPACE_ENTER with LANDLOCK_RULE_NAMESPACE
> > covering creation via unshare/clone and entry via setns), and the
> > backward-compatible degradation pattern for ABI < 9. A table documents
> > the per-namespace-type capability requirements for both creation and
> > entry.
> >
> > The admin guide adds the new perm.namespace_enter and
> > perm.capability_use audit blocker names with their object identification
> > fields (namespace_type, namespace_inum, capability).
> >
> > The kernel security documentation adds a "Ruleset restriction models"
> > section defining the three models (handled_access_*, handled_perm,
> > scoped), their coverage and compatibility properties, and the criteria
> > for choosing between them for future features. It also documents
> > composability with user namespaces and adds kernel-doc references for
> > the new capability and namespace headers.
> >
> > Cc: Christian Brauner <brauner@kernel.org>
> > Cc: Günther Noack <gnoack@google.com>
> > Cc: Paul Moore <paul@paul-moore.com>
> > Cc: Serge E. Hallyn <serge@hallyn.com>
> > Signed-off-by: Mickaël Salaün <mic@digikod.net>
> > ---
> > Documentation/admin-guide/LSM/landlock.rst | 19 ++-
> > Documentation/security/landlock.rst | 80 ++++++++++-
> > Documentation/userspace-api/landlock.rst | 156 ++++++++++++++++++++-
> > 3 files changed, 245 insertions(+), 10 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/LSM/landlock.rst b/Documentation/admin-guide/LSM/landlock.rst
> > index 9923874e2156..99c6a599ce9e 100644
> > --- a/Documentation/admin-guide/LSM/landlock.rst
> > +++ b/Documentation/admin-guide/LSM/landlock.rst
> > @@ -6,7 +6,7 @@ Landlock: system-wide management
> > ================================
> >
> > :Author: Mickaël Salaün
> > -:Date: January 2026
> > +:Date: March 2026
> >
> > Landlock can leverage the audit framework to log events.
> >
> > @@ -59,14 +59,25 @@ AUDIT_LANDLOCK_ACCESS
> > - scope.abstract_unix_socket - Abstract UNIX socket connection denied
> > - scope.signal - Signal sending denied
> >
> > + **perm.*** - Permission restrictions (ABI 9+):
> > + - perm.namespace_enter - Namespace entry was denied (creation via
> > + :manpage:`unshare(2)` / :manpage:`clone(2)` or joining via
> > + :manpage:`setns(2)`);
> > + ``namespace_type`` indicates the type (hex CLONE_NEW* bitmask),
> > + ``namespace_inum`` identifies the target namespace for
> > + :manpage:`setns(2)` operations
> > + - perm.capability_use - Capability use was denied;
> > + ``capability`` indicates the capability number
> > +
> > Multiple blockers can appear in a single event (comma-separated) when
> > multiple access rights are missing. For example, creating a regular file
> > in a directory that lacks both ``make_reg`` and ``refer`` rights would show
> > ``blockers=fs.make_reg,fs.refer``.
> >
> > - The object identification fields (path, dev, ino for filesystem; opid,
> > - ocomm for signals) depend on the type of access being blocked and provide
> > - context about what resource was involved in the denial.
> > + The object identification fields depend on the type of access being blocked:
> > + ``path``, ``dev``, ``ino`` for filesystem; ``opid``, ``ocomm`` for signals;
> > + ``namespace_type`` and ``namespace_inum`` for namespace operations;
> > + ``capability`` for capability use.
> >
> >
> > AUDIT_LANDLOCK_DOMAIN
> > diff --git a/Documentation/security/landlock.rst b/Documentation/security/landlock.rst
> > index 3e4d4d04cfae..cd3d640ca5c9 100644
> > --- a/Documentation/security/landlock.rst
> > +++ b/Documentation/security/landlock.rst
> > @@ -7,7 +7,7 @@ Landlock LSM: kernel documentation
> > ==================================
> >
> > :Author: Mickaël Salaün
> > -:Date: September 2025
> > +:Date: March 2026
> >
> > Landlock's goal is to create scoped access-control (i.e. sandboxing). To
> > harden a whole system, this feature should be available to any process,
> > @@ -89,6 +89,72 @@ this is required to keep access controls consistent over the whole system, and
> > this avoids unattended bypasses through file descriptor passing (i.e. confused
> > deputy attack).
> >
> > +Composability with user namespaces
> > +----------------------------------
> > +
> > +Landlock domain-based scoping and the kernel's user namespace-based capability
> > +scoping enforce isolation over independent hierarchies. Landlock checks domain
> > +ancestry; the kernel's ``ns_capable()`` checks user namespace ancestry. These
> > +hierarchies are orthogonal: Landlock enforcement is deterministic with respect
> > +to its own configuration, regardless of namespace or capability state, and vice
> > +versa. This orthogonality is a design invariant that must hold for all new
> > +scoped features.
> The last sentence on orthogonality may better belong under the restriction
> model section for scoped access rights. I assume that future scopes must
> also be deterministic with respect to landlock's configuration as well,
> not just user namespaces.
Correct
> > +
> > +Ruleset restriction models
> > +--------------------------
> +1
>
> This section is very helpful for aligning new features with a particular
> model.
Thanks
>
> > +
> > +Landlock provides three restriction models, each with different coverage
> > +and compatibility properties.
> Maybe add:
>
> Each restriction model below corresponds to one or more fields of
> ``struct landlock_ruleset_attr``.
Ok
>
> > +
> > +Access rights (``handled_access_*``)
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Access rights control **enumerated operations on kernel objects**
> > +identified by a rule key (a file hierarchy or a network port). Each
> > +``handled_access_*`` field declares a set of access rights that the
> > +ruleset restricts. Multiple access rights share a single rule type.
> > +Operations for which no access right exists yet remain uncontrolled;
> > +new rights are added incrementally across ABI versions.
> > +
> > +Permissions (``handled_perm``)
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Permissions control **broad operations enforced at single kernel
> > +chokepoints**, achieving complete deny-by-default coverage. Each
> > +``LANDLOCK_PERM_*`` flag maps to its own rule type. When a ruleset
> > +handles a permission, all instances of that operation are denied unless
> > +explicitly allowed by a rule. New kernel values (new ``CAP_*``
> > +capabilities, new ``CLONE_NEW*`` namespace types) are automatically
> > +denied without any Landlock update.
> > +
> > +Each permission flag names a single gateway operation whose control
> > +transitively covers an open-ended set of downstream operations: for
> > +example, exercising a capability enables privileged operations across
> > +many subsystems; entering a namespace enables gaining capabilities in a
> > +new context.
> > +
> > +Permission rules identify what to allow using constants defined by other
> > +kernel subsystems (``CAP_*``, ``CLONE_NEW*``). Unknown values are
> > +silently ignored because deny-by-default ensures they are denied anyway.
> > +In contrast, unknown ``LANDLOCK_PERM_*`` flags in ``handled_perm`` are
> > +rejected (``-EINVAL``), since Landlock owns that namespace.
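To make the asymmetry concrete, here is a small userspace sketch of the two validation policies described above. It is only a model under stated assumptions: every DEMO_* constant and demo_* function is made up for illustration and is not part of the Landlock UAPI or of this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for LANDLOCK_PERM_* flags (values are assumptions). */
#define DEMO_PERM_CAPABILITY_USE  (1ULL << 0)
#define DEMO_PERM_NAMESPACE_ENTER (1ULL << 1)
#define DEMO_PERM_MASK (DEMO_PERM_CAPABILITY_USE | DEMO_PERM_NAMESPACE_ENTER)

/* Assumed highest capability number known to this model. */
#define DEMO_CAP_LAST 40
#define DEMO_CAP_MASK ((1ULL << (DEMO_CAP_LAST + 1)) - 1)

/* Flags owned by Landlock: unknown bits are rejected. */
static int demo_check_handled_perm(uint64_t handled_perm)
{
	return (handled_perm & ~DEMO_PERM_MASK) ? -EINVAL : 0;
}

/*
 * Values owned by another subsystem: unknown bits are accepted but
 * never stored, so they stay denied by default.
 */
static int demo_add_cap_rule(uint64_t *allowed, uint64_t caps)
{
	*allowed |= caps & DEMO_CAP_MASK;
	return 0;
}

/* Deny-by-default lookup: only capabilities stored by a rule pass. */
static int demo_cap_allowed(uint64_t allowed, unsigned int cap)
{
	return cap <= DEMO_CAP_LAST && ((allowed >> cap) & 1);
}
```

In this model, a rule that names a capability bit above DEMO_CAP_LAST succeeds but changes nothing, whereas an unknown bit in the handled mask fails up front.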
> > +
> > +Scopes (``scoped``)
> > +~~~~~~~~~~~~~~~~~~~~
> > +
> > +Scopes restrict **cross-domain interactions** categorically, without
> > +rules. Setting a scope flag (e.g. ``LANDLOCK_SCOPE_SIGNAL``) denies the
> > +operation to targets outside the Landlock domain or its children. Like
> > +permissions, scopes provide complete coverage of the controlled
> > +operation.
> > +
> > +When adding new Landlock features, new operations on existing rule types
> > +extend the corresponding ``handled_access_*`` field (e.g. a new
> > +filesystem operation extends ``handled_access_fs``). A new object
> > +category with multiple fine-grained operations would use a new
> > +``handled_access_*`` field. New rule types that control a single
> > +chokepoint operation use ``handled_perm``.
> > +
> > Tests
> > =====
> >
> > @@ -110,6 +176,18 @@ Filesystem
> > .. kernel-doc:: security/landlock/fs.h
> > :identifiers:
> >
> > +Namespace
> > +---------
> > +
> > +.. kernel-doc:: security/landlock/ns.h
> > + :identifiers:
> > +
> > +Capability
> > +----------
> > +
> > +.. kernel-doc:: security/landlock/cap.h
> > + :identifiers:
> > +
> > Process credential
> > ------------------
> >
> > diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
> > index 13134bccdd39..238d30a18162 100644
> > --- a/Documentation/userspace-api/landlock.rst
> > +++ b/Documentation/userspace-api/landlock.rst
> > @@ -8,7 +8,7 @@ Landlock: unprivileged access control
> > =====================================
> >
> > :Author: Mickaël Salaün
> > -:Date: January 2026
> > +:Date: March 2026
> >
> > The goal of Landlock is to enable restriction of ambient rights (e.g. global
> > filesystem or network access) for a set of processes. Because Landlock
> > @@ -33,7 +33,7 @@ A Landlock rule describes an action on an object which the process intends to
> > perform. A set of rules is aggregated in a ruleset, which can then restrict
> > the thread enforcing it, and its future children.
> >
> > -The two existing types of rules are:
> > +The existing types of rules are:
> >
> > Filesystem rules
> > For these rules, the object is a file hierarchy,
> > @@ -44,6 +44,14 @@ Network rules (since ABI v4)
> > For these rules, the object is a TCP port,
> > and the related actions are defined with `network access rights`.
> >
> > +Capability rules (since ABI v9)
> > + For these rules, the object is a set of Linux capabilities,
> > + and the related actions are defined with `permission flags`.
> > +
> > +Namespace rules (since ABI v9)
> > + For these rules, the object is a set of namespace types,
> > + and the related actions are defined with `permission flags`.
> > +
> > Defining and enforcing a security policy
> > ----------------------------------------
> >
> > @@ -84,6 +92,9 @@ to be explicit about the denied-by-default access rights.
> > .scoped =
> > LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
> > LANDLOCK_SCOPE_SIGNAL,
> > + .handled_perm =
> > + LANDLOCK_PERM_CAPABILITY_USE |
> > + LANDLOCK_PERM_NAMESPACE_ENTER,
> > };
> >
> > Because we may not know which kernel version an application will be executed
> > @@ -127,6 +138,12 @@ version, and only use the available subset of access rights:
> > /* Removes LANDLOCK_SCOPE_* for ABI < 6 */
> > ruleset_attr.scoped &= ~(LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
> > LANDLOCK_SCOPE_SIGNAL);
> > + __attribute__((fallthrough));
> > + case 6:
> > + case 7:
> > + case 8:
> > + /* Removes permission support for ABI < 9 */
> > + ruleset_attr.handled_perm = 0;
> > }
> >
> > This enables the creation of an inclusive ruleset that will contain our rules.
> > @@ -191,6 +208,42 @@ number for a specific action: HTTPS connections.
> > err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
> > &net_port, 0);
> >
> > +For capability access-control, we can add rules that allow specific
> > +capabilities. For instance, to allow ``CAP_SYS_CHROOT`` (so the sandboxed
> > +process can call :manpage:`chroot(2)` inside a user namespace):
> > +
> > +.. code-block:: c
> > +
> > + struct landlock_capability_attr cap_attr = {
> > + .allowed_perm = LANDLOCK_PERM_CAPABILITY_USE,
> > + .capabilities = (1ULL << CAP_SYS_CHROOT),
> > + };
> > +
> > + err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_CAPABILITY,
> > + &cap_attr, 0);
> > +
> > +For namespace access-control, we can add rules that allow entering specific
> > +namespace types (creating them via :manpage:`unshare(2)` / :manpage:`clone(2)`
> > +or joining them via :manpage:`setns(2)`). For instance, to allow creating user
> > +namespaces (which grants all capabilities inside the new namespace):
> > +
> > +.. code-block:: c
> > +
> > + struct landlock_namespace_attr ns_attr = {
> > + .allowed_perm = LANDLOCK_PERM_NAMESPACE_ENTER,
> > + .namespace_types = CLONE_NEWUSER,
> > + };
> > +
> > + err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NAMESPACE,
> > + &ns_attr, 0);
> > +
> > +Together, these two rules allow an unprivileged process to create a user
> > +namespace and call :manpage:`chroot(2)` inside it, while denying all other
> > +capabilities and namespace types. User namespace creation is the one operation
> > +that does not require ``CAP_SYS_ADMIN``, so no capability rule is needed for it.
> > +See `Capability and namespace restrictions`_ for details on capability
> > +requirements.
> > +
> > When passing a non-zero ``flags`` argument to ``landlock_restrict_self()``, a
> > similar backwards compatibility check is needed for the restrict flags
> > (see sys_landlock_restrict_self() documentation for available flags):
> > @@ -354,10 +407,87 @@ The operations which can be scoped are:
> > A :manpage:`sendto(2)` on a socket which was previously connected will not
> > be restricted. This works for both datagram and stream sockets.
> >
> > -IPC scoping does not support exceptions via :manpage:`landlock_add_rule(2)`.
> > +Scoping does not support exceptions via :manpage:`landlock_add_rule(2)`.
> > If an operation is scoped within a domain, no rules can be added to allow access
> > to resources or processes outside of the scope.
> >
> > +Capability and namespace restrictions
> > +-------------------------------------
> > +
> > +See Documentation/security/landlock.rst for the design rationale behind
> > +the permission model (``handled_perm``) and how it differs from access
> > +rights (``handled_access_*``) and scopes (``scoped``).
> > +When a process creates a user namespace, the kernel grants all capabilities
> > +within that namespace. While these capabilities cannot directly bypass Landlock
> > +restrictions (Landlock enforces access controls independently of capability
> > +checks), they open kernel code paths that are normally unreachable to
> > +unprivileged users and may contain exploitable bugs.
> > +
> > +Landlock provides two complementary permissions to address this.
> > +``LANDLOCK_PERM_CAPABILITY_USE`` restricts which capabilities a process can use,
> > +even when it holds them. ``LANDLOCK_PERM_NAMESPACE_ENTER`` restricts which
> > +namespace types a process can create (via :manpage:`unshare(2)` or
> > +:manpage:`clone(2)`) or join (via :manpage:`setns(2)`). After creating a user
> > +namespace, the granted capabilities are scoped to namespaces owned by that user
> > +namespace or its descendants; to exercise a capability such as
> > +``CAP_NET_ADMIN``, the process must create a namespace of the corresponding type
> > +(e.g., a network namespace). Configuring both permissions together provides
> > +full coverage: ``LANDLOCK_PERM_CAPABILITY_USE`` restricts which capabilities are
> > +available, while ``LANDLOCK_PERM_NAMESPACE_ENTER`` restricts the namespaces in
> > +which they can be used.
> Maybe add a section on what this does versus PR_SET_NO_NEW_PRIVS.
Hmm, what do you mean? What would be the link with this part?
>
> The difference might be obvious to people familiar with namespaces and
> capabilities, but not to many users less familiar with the subject.
>
> I could see users of the LANDLOCK_PERM_* flags erroneously
> assuming that LANDLOCK_PERM_CAPABILITY_USE is required to restrict gaining
> new capabilities through execve() (i.e. through setuid), when in fact this is
> already restricted if NNP is set.
What would be the issue if no rule allows capabilities? The more
handled_* or scoped bits are set, the better.
>
> Some clarification on this would be helpful here or where
> PR_SET_NO_NEW_PRIVS is discussed in the Landlock docs.
Ok, I'll try to add something about NNP.
> > +
> > +When a Landlock domain handles ``LANDLOCK_PERM_CAPABILITY_USE``, all Linux
> > +:manpage:`capabilities(7)` are denied by default unless a rule explicitly allows
> Nit:
>
> all Linux :manpage:`capabilities(7)`
>
> might be better as
>
> the exercise of all Linux :manpage:`capabilities(7)`
Indeed
>
> Since, as pointed out before, we do not restrict their presence, but their
> exercise.
> > +them. This is purely restrictive: Landlock can only deny capabilities that the
> > +traditional capability mechanism would have allowed, never grant additional ones.
> > +Rules are added with ``LANDLOCK_RULE_CAPABILITY`` using a
> > +&struct landlock_capability_attr. Each rule specifies a set of ``CAP_*`` values
> > +(as a bitmask) to allow. Capabilities above ``CAP_LAST_CAP`` are silently
> > +accepted but have no effect since the kernel never checks them; this means new
> > +capabilities introduced by future kernels are automatically denied.
> > +
> > +When a Landlock domain handles ``LANDLOCK_PERM_NAMESPACE_ENTER``, namespace
> > +creation and entry are denied by default unless a rule explicitly allows them.
> > +Rules are added with ``LANDLOCK_RULE_NAMESPACE`` using a
> > +&struct landlock_namespace_attr. Each rule specifies a set of ``CLONE_NEW*``
> > +flags to allow.
> > +
> > +In practice, unprivileged processes first create a user namespace (which requires
> > +no capability and grants all capabilities within it), then use those capabilities
> > +to create other namespace types. All non-user namespace types require
> > +``CAP_SYS_ADMIN`` for both creation and :manpage:`setns(2)` entry; mount
> > +namespace entry additionally requires ``CAP_SYS_CHROOT``. For
> > +:manpage:`setns(2)`, capabilities are checked relative to the target namespace,
> > +so a process in an ancestor user namespace naturally satisfies them; this
> > +includes joining user namespaces, which requires ``CAP_SYS_ADMIN``. When
> > +``LANDLOCK_PERM_CAPABILITY_USE`` is also handled, each of these capabilities
> > +must be explicitly allowed by a rule.
> > +
> > +When combining ``CLONE_NEWUSER`` with other ``CLONE_NEW*`` flags in a single
> > +:manpage:`unshare(2)` call, the ``CAP_SYS_ADMIN`` check targets the newly
> > +created user namespace, which is handled by ``LANDLOCK_PERM_NAMESPACE_ENTER``
> > +independently from ``LANDLOCK_PERM_CAPABILITY_USE``. Performing the user
> > +namespace creation and the additional namespace creation in two separate
> > +:manpage:`unshare(2)` calls requires a rule allowing ``CAP_SYS_ADMIN`` if the
> > +domain also handles ``LANDLOCK_PERM_CAPABILITY_USE``.
> > +
> > +More generally, Landlock domains and user namespaces form independent
> > +hierarchies: Landlock domains restrict what actions are allowed (each stacked
> > +layer narrows the permitted set), while user namespaces restrict where
> > +capabilities take effect (only within the process's own namespace and its
> > +descendants). Landlock access controls are fully determined by the domain
> > +configuration, regardless of the process's position in the user namespace
> > +hierarchy. When creating child user namespaces, it is recommended to also
> > +create a dedicated Landlock domain with restrictions relevant to each namespace
> > +context.
> > +
> > +Note that ``LANDLOCK_PERM_CAPABILITY_USE`` restricts the *use* of capabilities,
> > +not their presence in the process's credential. Capability sets can change
> > +after a domain is enforced through user namespace entry, :manpage:`execve(2)` of
> > +binaries with file capabilities, or :manpage:`capset(2)`. In all cases,
> > +:manpage:`capget(2)` will report the credential's capability sets, but any
> > +denied capability will fail with ``EPERM`` when exercised.
> > +
> > Truncating files
> > ----------------
> >
> > @@ -515,7 +645,7 @@ Access rights
> > -------------
> >
> > .. kernel-doc:: include/uapi/linux/landlock.h
> > - :identifiers: fs_access net_access scope
> > + :identifiers: fs_access net_access scope perm
> >
> > Creating a new ruleset
> > ----------------------
> > @@ -534,7 +664,8 @@ Extending a ruleset
> >
> > .. kernel-doc:: include/uapi/linux/landlock.h
> > :identifiers: landlock_rule_type landlock_path_beneath_attr
> > - landlock_net_port_attr
> > + landlock_net_port_attr landlock_capability_attr
> > + landlock_namespace_attr
> >
> > Enforcing a ruleset
> > -------------------
> > @@ -685,6 +816,21 @@ enforce Landlock rulesets across all threads of the calling process
> > using the ``LANDLOCK_RESTRICT_SELF_TSYNC`` flag passed to
> > sys_landlock_restrict_self().
> >
> > +Capability restriction (ABI < 9)
> > +--------------------------------
> > +
> > +Starting with the Landlock ABI version 9, it is possible to restrict
> > +:manpage:`capabilities(7)` with the new ``LANDLOCK_PERM_CAPABILITY_USE``
> > +permission flag and ``LANDLOCK_RULE_CAPABILITY`` rule type.
> > +
> > +Namespace restriction (ABI < 9)
> > +-------------------------------
> > +
> > +Starting with the Landlock ABI version 9, it is possible to restrict
> > +namespace creation (:manpage:`unshare(2)`, :manpage:`clone(2)`) and entry
> > +(:manpage:`setns(2)`) with the new ``LANDLOCK_PERM_NAMESPACE_ENTER`` permission
> > +flag and ``LANDLOCK_RULE_NAMESPACE`` rule type.
> > +
> > .. _kernel_support:
> >
> > Kernel support
> > --
> > 2.53.0
> >
>
* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Jonathan McDowell @ 2026-04-23 14:03 UTC (permalink / raw)
To: Yeoreum Yun
Cc: Mimi Zohar, linux-security-module, linux-kernel, linux-integrity,
linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
will, noodles, sebastianene
In-Reply-To: <aeokwrC86WI7uT+K@e129823.arm.com>
On Thu, Apr 23, 2026 at 02:55:14PM +0100, Yeoreum Yun wrote:
>> On Thu, 2026-04-23 at 13:53 +0100, Jonathan McDowell wrote:
>> > On Thu, Apr 23, 2026 at 01:34:13PM +0100, Yeoreum Yun wrote:
>> > > > > On Thu, 2026-04-23 at 06:55 +0100, Yeoreum Yun wrote:
>> > > > > > > On Wed, 2026-04-22 at 20:41 +0100, Yeoreum Yun wrote:
>> > > > > > > > > Hi Mimi,
>> > > > > > > > >
>> > > > > > > > > > On Wed, 2026-04-22 at 17:24 +0100, Yeoreum Yun wrote:
>> > > > > > > > > > > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
>> > > > > > > > > > > the TPM driver must be built as built-in and
>> > > > > > > > > > > must be probed before the IMA subsystem is initialized.
>> > > > > > > > > > >
>> > > > > > > > > > > However, when the TPM device operates over the FF-A protocol using
>> > > > > > > > > > > the CRB interface, probing fails and returns -EPROBE_DEFER if
>> > > > > > > > > > > the tpm_crb_ffa device — an FF-A device that provides the communication
>> > > > > > > > > > > interface to the tpm_crb driver — has not yet been probed.
>> > > > > > > > > > >
>> > > > > > > > > > > To ensure the TPM device operating over the FF-A protocol with
>> > > > > > > > > > > the CRB interface is probed before IMA initialization,
>> > > > > > > > > > > the following conditions must be met:
>> > > > > > > > > > >
>> > > > > > > > > > > 1. The corresponding ffa_device must be registered,
>> > > > > > > > > > > which is done via ffa_init().
>> > > > > > > > > > >
>> > > > > > > > > > > 2. The tpm_crb_driver must successfully probe this device via
>> > > > > > > > > > > tpm_crb_ffa_init().
>> > > > > > > > > > >
>> > > > > > > > > > > 3. The tpm_crb driver using CRB over FF-A can then
>> > > > > > > > > > > be probed successfully. (See crb_acpi_add() and
>> > > > > > > > > > > tpm_crb_ffa_init() for reference.)
>> > > > > > > > > > >
>> > > > > > > > > > > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
>> > > > > > > > > > > all registered with device_initcall, which means crb_acpi_driver_init() may
>> > > > > > > > > > > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
>> > > > > > > > > > >
>> > > > > > > > > > > When this occurs, probing the TPM device is deferred.
>> > > > > > > > > > > However, the deferred probe can happen after the IMA subsystem
>> > > > > > > > > > > has already been initialized, since IMA initialization is performed
>> > > > > > > > > > > during late_initcall, and deferred_probe_initcall() is performed
>> > > > > > > > > > > at the same level.
>> > > > > > > > > > >
>> > > > > > > > > > > To resolve this, call ima_init() again at late_initcall_sync level
>> > > > > > > > > > > so that IMA does not miss the TPM PCR values when generating the boot_aggregate
>> > > > > > > > > > > log even though a TPM device is present in the system.
>> > > > > > > > > > >
>> > > > > > > > > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
>> > > > > > > > > >
>> > > > > > > > > > A lot of change for just detecting whether ima_init() is being called on
>> > > > > > > > > > late_initcall or late_initcall_sync(), without any explanation for all the other
>> > > > > > > > > > changes (e.g. ima_init_core).
>> > > > > > > > > >
>> > > > > > > > > > Please just limit the change to just calling ima_init() twice.
>> > > > > > > > >
>> > > > > > > > > My concern is that ima_update_policy_flags() will be called
>> > > > > > > > > when ima_init() is deferred -- not initialised anything.
>> > > > > > > > > though functionally, it might be okay however,
>> > > > > > > > > I think ima_update_policy_flags() and notifier should work after ima_init()
>> > > > > > > > > works logically.
>> > > > > > > > >
>> > > > > > > > > This change I think not much quite a lot. just wrapper ima_init() with
>> > > > > > > > > ima_init_core() with some error handling.
>> > > > > > > > >
>> > > > > > > > > Am I missing something?
>> > > > > > > >
>> > > > > > > > Also, if we handle in ima_init() only, but it failed with other reason,
>> > > > > > > > we shouldn't call again ima_init() in the late_initcall_sync.
>> > > > > > > >
>> > > > > > > > To handle this, It wouldn't do in the ima_init() but we need to handle
>> > > > > > > > it by caller of ima_init().
>> > > > > > >
>> > > > > > > Only tpm_default_chip() is being called to set the ima_tpm_chip. On failure,
>> > > > > > > instead of going into TPM-bypass mode, return immediately. There are no calls
>> > > > > > > to anything else. Just call ima_init() a second time.
>> > > > > >
>> > > > > > I’m not fully convinced this is sufficient.
>> > > > > >
>> > > > > > What I meant is the case where ima_init() fails due to other
>> > > > > > initialisation steps, not only tpm_default_chip() (e.g. ima_fs_init()).
>> > > > >
>> > > > > The purpose of THIS patch is to add late_initcall_sync, when the TPM is not
>> > > > > available at late_initcall. This would be classified as a bug fix and would be
>> > > > > backported. No other changes should be included in this patch.
>> > > >
>> > > > Okay.
>> > > >
>> > > > > >
>> > > > > > I’d also like to ask again whether it is fine to call
>> > > > > > ima_update_policy_flags() and keep the notifier registered in the
>> > > > > > deferred TPM case. While this may be functionally acceptable, it seems
>> > > > > > logically questionable to do so when ima_init() has not completed.
>> > > > >
>> > > > > Other than extending the TPM, IMA should behave exactly the same whether there
>> > > > > is a TPM or goes into TPM-bypass mode.
>> > > > >
>> > > > > >
>> > > > > > There is also a possibility that a deferred case ultimately fails (e.g.
>> > > > > > deferred at late_initcall, but then failing at late_initcall_sync
>> > > > > > for another reason, even while entering TPM bypass mode). In that case,
>> > > > > > it seems more appropriate to handle this state in the caller of
>> > > > > > ima_init(), rather than inside ima_init() itself.
>> > > > >
>> > > > > If the TPM isn't found at late_initcall_sync(), then IMA should go into TPM-
>> > > > > bypass mode. Please don't make any other changes to the existing IMA behavior
>> > > > > and hide it here behind the late_initcall_sync change.
>> > > >
>> > > > Okay. You're saying that calling ima_update_policy_flags() at late_initcall
>> > > > wouldn't be a problem even if late_initcall_sync's ima_init()
>> > > > fails and ends up in "TPM-bypass mode".
>> > > >
>> > > > I see then, I'll make a patch simpler then.
>> > >
>> > > But I think in case of below situation:
>> > > - late_initcall's first ima_init() is deferred.
>> > > - late_initcall_sync try again but failed and try again with
>> > > CONFIG_IMA_DEFAULT_HASH.
>> > >
>> > > I would like to sustain init_ima_core to reduce the same code repeat
>> > > in late_initcall_sync.
>> >
>> > I think what Mimi's proposing is:
>> >
>> > If we're in late_initcall, and the TPM isn't available, return
>> > immediately with an error (the EPROBE_DEFER?), don't do any init.
>> >
>> > If we're in late_initcall_sync, either we're already initialised, so
>> > return and do nothing, or run through the entire flow, even if the TPM
>> > isn't available.
>> >
>> > So ima_init() just needs to know a) if it's in the sync or non-sync mode
>> > and b) for the sync mode, if we've already done the init at
>> > non-sync.
>>
>> Thanks, Jonathan. That is exactly what I'm suggesting. Any other changes
>> should not be included in this patch. Since Yeoreum is not hearing me, feel
>> free to post a patch.
>
>I see. So what you need is only this.
>If it looks good to you, I'll send it as v3.
FWIW, I pulled the tpm_default_chip check out a level to account for the
extra init you mentioned, and have the following (completely untested and
not even compiled, but it shows the approach):
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index d48bf0ad26f4..88fe105b7f00 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -166,6 +166,7 @@ enum lsm_order {
* @initcall_fs: LSM callback for fs_initcall setup, optional
* @initcall_device: LSM callback for device_initcall() setup, optional
* @initcall_late: LSM callback for late_initcall() setup, optional
+ * @initcall_late_sync: LSM callback for late_initcall_sync() setup, optional
*/
struct lsm_info {
const struct lsm_id *id;
@@ -181,6 +182,7 @@ struct lsm_info {
int (*initcall_fs)(void);
int (*initcall_device)(void);
int (*initcall_late)(void);
+ int (*initcall_late_sync)(void);
};
#define DEFINE_LSM(lsm) \
diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
index a2f34f2d8ad7..a60dfb8316d8 100644
--- a/security/integrity/ima/ima_init.c
+++ b/security/integrity/ima/ima_init.c
@@ -119,10 +119,6 @@ int __init ima_init(void)
{
int rc;
- ima_tpm_chip = tpm_default_chip();
- if (!ima_tpm_chip)
- pr_info("No TPM chip found, activating TPM-bypass!\n");
-
rc = integrity_init_keyring(INTEGRITY_KEYRING_IMA);
if (rc)
return rc;
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 1d6229b156fb..b60a85fa803a 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -1237,7 +1237,7 @@ static int ima_kernel_module_request(char *kmod_name)
#endif /* CONFIG_INTEGRITY_ASYMMETRIC_KEYS */
-static int __init init_ima(void)
+static int __init init_ima(bool sync)
{
int error;
@@ -1247,6 +1247,19 @@ static int __init init_ima(void)
return 0;
}
+ /* If we found the TPM during our first attempt, nothing further to do */
+ if (sync && ima_tpm_chip)
+ return 0;
+
+ ima_tpm_chip = tpm_default_chip();
+ if (!ima_tpm_chip && !sync) {
+ pr_debug("TPM not available, will try later\n");
+ return -EPROBE_DEFER;
+ }
+
+ if (!ima_tpm_chip)
+ pr_info("No TPM chip found, activating TPM-bypass!\n");
+
ima_appraise_parse_cmdline();
ima_init_template_list();
hash_setup(CONFIG_IMA_DEFAULT_HASH);
@@ -1274,6 +1287,16 @@ static int __init init_ima(void)
return error;
}
+static int __init init_ima_late(void)
+{
+ return init_ima(false);
+}
+
+static int __init init_ima_late_sync(void)
+{
+ return init_ima(true);
+}
+
static struct security_hook_list ima_hooks[] __ro_after_init = {
LSM_HOOK_INIT(bprm_check_security, ima_bprm_check),
LSM_HOOK_INIT(bprm_creds_for_exec, ima_bprm_creds_for_exec),
@@ -1319,6 +1342,7 @@ DEFINE_LSM(ima) = {
.init = init_ima_lsm,
.order = LSM_ORDER_LAST,
.blobs = &ima_blob_sizes,
- /* Start IMA after the TPM is available */
- .initcall_late = init_ima,
+ /* Ensure we start IMA after the TPM is available */
+ .initcall_late = init_ima_late,
+ .initcall_late_sync = init_ima_late_sync,
};
diff --git a/security/lsm_init.c b/security/lsm_init.c
index 573e2a7250c4..4e5c59beb82a 100644
--- a/security/lsm_init.c
+++ b/security/lsm_init.c
@@ -547,13 +547,22 @@ device_initcall(security_initcall_device);
* security_initcall_late - Run the LSM late initcalls
*/
static int __init security_initcall_late(void)
+{
+ return lsm_initcall(late);
+}
+late_initcall(security_initcall_late);
+
+/**
+ * security_initcall_late_sync - Run the LSM late initcalls sync
+ */
+static int __init security_initcall_late_sync(void)
{
int rc;
- rc = lsm_initcall(late);
+ rc = lsm_initcall(late_sync);
lsm_pr_dbg("all enabled LSMs fully activated\n");
call_blocking_lsm_notifier(LSM_STARTED_ALL, NULL);
return rc;
}
-late_initcall(security_initcall_late);
+late_initcall_sync(security_initcall_late_sync);
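
For anyone who wants to poke at the control flow without booting a kernel, the two-pass logic in init_ima() above can be modelled in plain userspace C. This is only a sketch under stated assumptions: all demo_* names are hypothetical stand-ins (demo_default_chip() plays the role of tpm_default_chip(), demo_tpm_chip the role of ima_tpm_chip), DEMO_EPROBE_DEFER copies the kernel-internal errno value, and the "full init" is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_EPROBE_DEFER 517 /* kernel-internal errno value, redefined here */

static bool demo_tpm_probed;    /* true once the (deferred) probe has completed */
static int demo_chip_storage;   /* dummy object standing in for a TPM chip */
static int *demo_tpm_chip;      /* stands in for ima_tpm_chip */
static int demo_full_init_runs; /* how many times the full init path ran */

/* Stands in for tpm_default_chip(): NULL until the driver has probed. */
static int *demo_default_chip(void)
{
	return demo_tpm_probed ? &demo_chip_storage : NULL;
}

/* Mirrors the shape of init_ima(bool sync) in the diff above. */
static int demo_init_ima(bool sync)
{
	/* Second pass: the first pass already found the TPM. */
	if (sync && demo_tpm_chip)
		return 0;

	demo_tpm_chip = demo_default_chip();
	if (!demo_tpm_chip && !sync)
		return -DEMO_EPROBE_DEFER; /* first pass: try again later */

	/* With or without a chip (TPM-bypass), run the rest of the init. */
	demo_full_init_runs++;
	return 0;
}

static void demo_reset(bool probed)
{
	demo_tpm_probed = probed;
	demo_tpm_chip = NULL;
	demo_full_init_runs = 0;
}

static void demo_run_scenarios(void)
{
	/* 1. TPM ready at late_initcall: sync pass is a no-op. */
	demo_reset(true);
	assert(demo_init_ima(false) == 0);
	assert(demo_init_ima(true) == 0);
	assert(demo_full_init_runs == 1);

	/* 2. Probe deferred, completes before late_initcall_sync. */
	demo_reset(false);
	assert(demo_init_ima(false) == -DEMO_EPROBE_DEFER);
	demo_tpm_probed = true;
	assert(demo_init_ima(true) == 0);
	assert(demo_full_init_runs == 1 && demo_tpm_chip != NULL);

	/* 3. TPM never appears: sync pass initialises in bypass mode. */
	demo_reset(false);
	assert(demo_init_ima(false) == -DEMO_EPROBE_DEFER);
	assert(demo_init_ima(true) == 0);
	assert(demo_full_init_runs == 1 && demo_tpm_chip == NULL);
}
```

Calling demo_run_scenarios() from a main() exercises all three cases; in each one the full init path runs exactly once. The model says nothing about a first pass that finds the TPM but then fails later in init, so that case would still need checking against the real code.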
J.
--
Rock and roll stops the traffic.
This .sig brought to you by the letter A and the number 40
Product of the Republic of HuggieTag
* Re: [PATCH v2 0/4] Firmware LSM hook
From: Leon Romanovsky @ 2026-04-23 14:09 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Paul Moore, Roberto Sassu, KP Singh, Matt Bobrowski,
Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
Saeed Mahameed, Itay Avraham, Dave Jiang, Jonathan Cameron, bpf,
linux-kernel, linux-kselftest, linux-rdma, Chiara Meiohas,
Maher Sanalla, linux-security-module
In-Reply-To: <20260417191749.GK2577880@ziepe.ca>
On Fri, Apr 17, 2026 at 04:17:49PM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 15, 2026 at 05:40:04PM -0400, Paul Moore wrote:
<...>
> > Leon mentioned that different firmware revisions would have different
> > parameters for a given opcode, and that one would need to inspect
> > those parameters to properly filter the command. Is that not true, or
> > am I misreading or misunderstanding Leon's comments?
>
> They are ABI stable, so there will be rules about future changes that
> old software can follow to ignore or reject future things it doesn't
> understand.
That is wishful thinking and applies only to mlx5 devices. No one
promises that other devices follow the same ABI rules.
Thanks
* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Yeoreum Yun @ 2026-04-23 14:33 UTC (permalink / raw)
To: Jonathan McDowell
Cc: Mimi Zohar, linux-security-module, linux-kernel, linux-integrity,
linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
will, noodles, sebastianene
In-Reply-To: <aeomlp3I0eVE5mce@earth.li>
Hi Jonathan,
> On Thu, Apr 23, 2026 at 02:55:14PM +0100, Yeoreum Yun wrote:
> > > On Thu, 2026-04-23 at 13:53 +0100, Jonathan McDowell wrote:
> > > > On Thu, Apr 23, 2026 at 01:34:13PM +0100, Yeoreum Yun wrote:
> > > > > > > On Thu, 2026-04-23 at 06:55 +0100, Yeoreum Yun wrote:
> > > > > > > > > On Wed, 2026-04-22 at 20:41 +0100, Yeoreum Yun wrote:
> > > > > > > > > > > Hi Mimi,
> > > > > > > > > > >
> > > > > > > > > > > > On Wed, 2026-04-22 at 17:24 +0100, Yeoreum Yun wrote:
> > > > > > > > > > > > > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
> > > > > > > > > > > > > the TPM driver must be built as built-in and
> > > > > > > > > > > > > must be probed before the IMA subsystem is initialized.
> > > > > > > > > > > > >
> > > > > > > > > > > > > However, when the TPM device operates over the FF-A protocol using
> > > > > > > > > > > > > the CRB interface, probing fails and returns -EPROBE_DEFER if
> > > > > > > > > > > > > the tpm_crb_ffa device — an FF-A device that provides the communication
> > > > > > > > > > > > > interface to the tpm_crb driver — has not yet been probed.
> > > > > > > > > > > > >
> > > > > > > > > > > > > To ensure the TPM device operating over the FF-A protocol with
> > > > > > > > > > > > > the CRB interface is probed before IMA initialization,
> > > > > > > > > > > > > the following conditions must be met:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. The corresponding ffa_device must be registered,
> > > > > > > > > > > > > which is done via ffa_init().
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2. The tpm_crb_driver must successfully probe this device via
> > > > > > > > > > > > > tpm_crb_ffa_init().
> > > > > > > > > > > > >
> > > > > > > > > > > > > 3. The tpm_crb driver using CRB over FF-A can then
> > > > > > > > > > > > > be probed successfully. (See crb_acpi_add() and
> > > > > > > > > > > > > tpm_crb_ffa_init() for reference.)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
> > > > > > > > > > > > > all registered with device_initcall, which means crb_acpi_driver_init() may
> > > > > > > > > > > > > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
> > > > > > > > > > > > >
> > > > > > > > > > > > > When this occurs, probing the TPM device is deferred.
> > > > > > > > > > > > > However, the deferred probe can happen after the IMA subsystem
> > > > > > > > > > > > > has already been initialized, since IMA initialization is performed
> > > > > > > > > > > > > during late_initcall, and deferred_probe_initcall() is performed
> > > > > > > > > > > > > at the same level.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > To resolve this, call ima_init() again at late_initcall_sync level
> > > > > > > > > > > > > > so that IMA does not miss the TPM PCR values when generating the boot_aggregate
> > > > > > > > > > > > > > log even though a TPM device is present in the system.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > > > > > > > > > > >
> > > > > > > > > > > > A lot of change for just detecting whether ima_init() is being called on
> > > > > > > > > > > > late_initcall or late_initcall_sync(), without any explanation for all the other
> > > > > > > > > > > > changes (e.g. ima_init_core).
> > > > > > > > > > > >
> > > > > > > > > > > > Please just limit the change to just calling ima_init() twice.
> > > > > > > > > > >
> > > > > > > > > > > My concern is that ima_update_policy_flags() will be called
> > > > > > > > > > > while ima_init() is deferred, i.e. before anything has been initialised.
> > > > > > > > > > > Functionally that might be okay, but logically
> > > > > > > > > > > I think ima_update_policy_flags() and the notifier should only run
> > > > > > > > > > > after ima_init() has completed.
> > > > > > > > > > >
> > > > > > > > > > > I don't think this change is that large: it just wraps ima_init() in
> > > > > > > > > > > ima_init_core() with some error handling.
> > > > > > > > > > >
> > > > > > > > > > > Am I missing something?
> > > > > > > > > >
> > > > > > > > > > Also, if we handle this only inside ima_init() and it fails for some
> > > > > > > > > > other reason, we shouldn't call ima_init() again at late_initcall_sync.
> > > > > > > > > >
> > > > > > > > > > To handle this, the check can't live in ima_init(); it needs to be
> > > > > > > > > > handled by the caller of ima_init().
> > > > > > > > >
> > > > > > > > > Only tpm_default_chip() is being called to set the ima_tpm_chip. On failure,
> > > > > > > > > instead of going into TPM-bypass mode, return immediately. There are no calls
> > > > > > > > > to anything else. Just call ima_init() a second time.
> > > > > > > >
> > > > > > > > I’m not fully convinced this is sufficient.
> > > > > > > >
> > > > > > > > What I meant is the case where ima_init() fails due to other
> > > > > > > > initialisation steps, not only tpm_default_chip() (e.g. ima_fs_init()).
> > > > > > >
> > > > > > > The purpose of THIS patch is to add late_initcall_sync, when the TPM is not
> > > > > > > available at late_initcall. This would be classified as a bug fix and would be
> > > > > > > backported. No other changes should be included in this patch.
> > > > > >
> > > > > > Okay.
> > > > > >
> > > > > > > >
> > > > > > > > I’d also like to ask again whether it is fine to call
> > > > > > > > ima_update_policy_flags() and keep the notifier registered in the
> > > > > > > > deferred TPM case. While this may be functionally acceptable, it seems
> > > > > > > > logically questionable to do so when ima_init() has not completed.
> > > > > > >
> > > > > > > Other than extending the TPM, IMA should behave exactly the same whether there
> > > > > > > is a TPM or goes into TPM-bypass mode.
> > > > > > >
> > > > > > > >
> > > > > > > > There is also a possibility that a deferred case ultimately fails (e.g.
> > > > > > > > deferred at late_initcall, but then failing at late_initcall_sync
> > > > > > > > for another reason, even while entering TPM bypass mode). In that case,
> > > > > > > > it seems more appropriate to handle this state in the caller of
> > > > > > > > ima_init(), rather than inside ima_init() itself.
> > > > > > >
> > > > > > > If the TPM isn't found at late_initcall_sync(), then IMA should go into TPM-
> > > > > > > bypass mode. Please don't make any other changes to the existing IMA behavior
> > > > > > > and hide it here behind the late_initcall_sync change.
> > > > > >
> > > > > > Okay. So you're saying that calling ima_update_policy_flags() at
> > > > > > late_initcall is not a problem, even if ima_init() at late_initcall_sync
> > > > > > ends up in "TPM-bypass mode".
> > > > > >
> > > > > > I see. I'll make the patch simpler, then.
> > > > >
> > > > > But consider the following situation:
> > > > > - The first ima_init(), at late_initcall, is deferred.
> > > > > - late_initcall_sync tries again, fails, and retries with
> > > > > CONFIG_IMA_DEFAULT_HASH.
> > > > >
> > > > > For that case I would like to keep init_ima_core, to avoid repeating
> > > > > the same code at late_initcall_sync.
> > > >
> > > > I think what Mimi's proposing is:
> > > >
> > > > If we're in late_initcall, and the TPM isn't available, return
> > > > immediately with an error (the EPROBE_DEFER?), don't do any init.
> > > >
> > > > If we're in late_initcall_sync, either we're already initialised, so
> > > > return and do nothing, or run through the entire flow, even if the TPM
> > > > isn't available.
> > > >
> > > > So ima_init() just needs to know a) if it's in the sync or non-sync mode
> > > > and b) for the sync mode, if we've already done the init at
> > > > non-sync.
> > >
> > > Thanks, Jonathan. That is exactly what I'm suggesting. Any other changes
> > > should not be included in this patch. Since Yeoreum is not hearing me, feel
> > > free to post a patch.
> >
> > I see. So what you need is only this.
> > If it looks good to you, I'll send it as v3.
>
> FWIW, I pulled the tpm_default_chip check out a level to account for the
> extra init you mentioned, and have the following (completely untested or
> compiled, but gives the approach):
>
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index d48bf0ad26f4..88fe105b7f00 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -166,6 +166,7 @@ enum lsm_order {
> * @initcall_fs: LSM callback for fs_initcall setup, optional
> * @initcall_device: LSM callback for device_initcall() setup, optional
> * @initcall_late: LSM callback for late_initcall() setup, optional
> + * @initcall_late_sync: LSM callback for late_initcall_sync() setup, optional
> */
> struct lsm_info {
> const struct lsm_id *id;
> @@ -181,6 +182,7 @@ struct lsm_info {
> int (*initcall_fs)(void);
> int (*initcall_device)(void);
> int (*initcall_late)(void);
> + int (*initcall_late_sync)(void);
> };
> #define DEFINE_LSM(lsm) \
> diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
> index a2f34f2d8ad7..a60dfb8316d8 100644
> --- a/security/integrity/ima/ima_init.c
> +++ b/security/integrity/ima/ima_init.c
> @@ -119,10 +119,6 @@ int __init ima_init(void)
> {
> int rc;
> - ima_tpm_chip = tpm_default_chip();
> - if (!ima_tpm_chip)
> - pr_info("No TPM chip found, activating TPM-bypass!\n");
> -
> rc = integrity_init_keyring(INTEGRITY_KEYRING_IMA);
> if (rc)
> return rc;
> diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
> index 1d6229b156fb..b60a85fa803a 100644
> --- a/security/integrity/ima/ima_main.c
> +++ b/security/integrity/ima/ima_main.c
> @@ -1237,7 +1237,7 @@ static int ima_kernel_module_request(char *kmod_name)
> #endif /* CONFIG_INTEGRITY_ASYMMETRIC_KEYS */
> -static int __init init_ima(void)
> +static int __init init_ima(bool sync)
> {
> int error;
> @@ -1247,6 +1247,19 @@ static int __init init_ima(void)
> return 0;
> }
> + /* If we found the TPM during our first attempt, nothing further to do */
> + if (sync && ima_tpm_chip)
> + return 0;
> +
> + ima_tpm_chip = tpm_default_chip();
> + if (!ima_tpm_chip && !sync) {
> + pr_debug("TPM not available, will try later\n");
> + return -EPROBE_DEFER;
> + }
> +
> + if (!ima_tpm_chip)
> + pr_info("No TPM chip found, activating TPM-bypass!\n");
> +
> ima_appraise_parse_cmdline();
> ima_init_template_list();
> hash_setup(CONFIG_IMA_DEFAULT_HASH);
> @@ -1274,6 +1287,16 @@ static int __init init_ima(void)
> return error;
> }
> +static int __init init_ima_late(void)
> +{
> + return init_ima(false);
> +}
> +
> +static int __init init_ima_late_sync(void)
> +{
> + return init_ima(true);
> +}
> +
> static struct security_hook_list ima_hooks[] __ro_after_init = {
> LSM_HOOK_INIT(bprm_check_security, ima_bprm_check),
> LSM_HOOK_INIT(bprm_creds_for_exec, ima_bprm_creds_for_exec),
> @@ -1319,6 +1342,7 @@ DEFINE_LSM(ima) = {
> .init = init_ima_lsm,
> .order = LSM_ORDER_LAST,
> .blobs = &ima_blob_sizes,
> - /* Start IMA after the TPM is available */
> - .initcall_late = init_ima,
> + /* Ensure we start IMA after the TPM is available */
> + .initcall_late = init_ima_late,
> + .initcall_late_sync = init_ima_late_sync,
> };
> diff --git a/security/lsm_init.c b/security/lsm_init.c
> index 573e2a7250c4..4e5c59beb82a 100644
> --- a/security/lsm_init.c
> +++ b/security/lsm_init.c
> @@ -547,13 +547,22 @@ device_initcall(security_initcall_device);
> * security_initcall_late - Run the LSM late initcalls
> */
> static int __init security_initcall_late(void)
> +{
> + return lsm_initcall(late);
> +}
> +late_initcall(security_initcall_late);
> +
> +/**
> + * security_initcall_late_sync - Run the LSM late initcalls sync
> + */
> +static int __init security_initcall_late_sync(void)
> {
> int rc;
> - rc = lsm_initcall(late);
> + rc = lsm_initcall(late_sync);
> lsm_pr_dbg("all enabled LSMs fully activated\n");
> call_blocking_lsm_notifier(LSM_STARTED_ALL, NULL);
> return rc;
> }
> -late_initcall(security_initcall_late);
> +late_initcall_sync(security_initcall_late_sync);
I'm fine with this. But weren't we talking about "ima_init()", not "init_ima()"?
Because of that, I fixated on the wrong function and went on at length for nothing.
If this looks good to Mimi, I don't mind who sends it.
But if you're going to send this, could you include 2 and 3 too?
Thanks.
--
Sincerely,
Yeoreum Yun
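As a rough illustration of the two-pass flow in the patch quoted above, the
control flow of the proposed init_ima(bool sync) can be modelled in userspace C.
This is only a sketch under stated assumptions: fake_tpm_default_chip(),
tpm_probed, full_init_runs, model_init_ima() and model_reset() are hypothetical
stand-ins invented for the model, and the rest of the IMA setup is reduced to a
counter. It is not kernel code and has not been checked against the real tree.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, not exposed to userspace */
#endif

/* Hypothetical stand-ins for kernel state; none of this is real IMA code. */
struct fake_tpm_chip { int id; };

static struct fake_tpm_chip the_chip = { .id = 0 };
static bool tpm_probed;			/* true once the (deferred) probe succeeded */
static struct fake_tpm_chip *ima_tpm_chip;
static int full_init_runs;		/* how often the full init flow ran */

/* Stand-in for tpm_default_chip(): NULL until the TPM has been probed. */
static struct fake_tpm_chip *fake_tpm_default_chip(void)
{
	return tpm_probed ? &the_chip : NULL;
}

/* Reset the model so that several boot scenarios can be replayed. */
static void model_reset(bool probed)
{
	tpm_probed = probed;
	ima_tpm_chip = NULL;
	full_init_runs = 0;
}

/* Models only the control flow of the proposed init_ima(bool sync). */
static int model_init_ima(bool sync)
{
	/* TPM was found during the first attempt: nothing further to do. */
	if (sync && ima_tpm_chip)
		return 0;

	ima_tpm_chip = fake_tpm_default_chip();
	if (!ima_tpm_chip && !sync)
		return -EPROBE_DEFER;	/* retry at late_initcall_sync */

	if (!ima_tpm_chip)
		puts("No TPM chip found, activating TPM-bypass!");

	full_init_runs++;		/* the remaining init steps would run here */
	return 0;
}
```

Calling model_reset(false), then model_init_ima(false), then setting
tpm_probed = true and calling model_init_ima(true) mimics a TPM probe that was
deferred past late_initcall: the first call returns -EPROBE_DEFER without
running any init, and the second runs the full flow once with the chip set.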
* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Mimi Zohar @ 2026-04-23 14:48 UTC (permalink / raw)
To: Jonathan McDowell, Yeoreum Yun
Cc: linux-security-module, linux-kernel, linux-integrity,
linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
will, noodles, sebastianene
In-Reply-To: <aeomlp3I0eVE5mce@earth.li>
On Thu, 2026-04-23 at 15:03 +0100, Jonathan McDowell wrote:
> On Thu, Apr 23, 2026 at 02:55:14PM +0100, Yeoreum Yun wrote:
> > > On Thu, 2026-04-23 at 13:53 +0100, Jonathan McDowell wrote:
> > > > I think what Mimi's proposing is:
> > > >
> > > > If we're in late_initcall, and the TPM isn't available, return
> > > > immediately with an error (the EPROBE_DEFER?), don't do any init.
> > > >
> > > > If we're in late_initcall_sync, either we're already initialised, so do
> > > > return and nothing, or run through the entire flow, even if the TPM
> > > > isn't unavailable.
> > > >
> > > > So ima_init() just needs to know a) if it's in the sync or non-sync mode
> > > > and b) for the sync mode, if we've already done the init at
> > > > non-sync.
> > >
> > > Thanks, Jonathan. That is exactly what I'm suggesting. Any other changes
> > > should not be included in this patch. Since Yeoreum is not hearing me, feel
> > > free to post a patch.
> >
> > I see. so what you need to is this only
> > If it looks good to you. I'll send it at v3.
>
> FWIW, I pulled the tpm_default_chip check out a level to account for the
> extra init you mentioned, and have the following (completely untested or
> compiled, but gives the approach):
Thanks, Jonathan! It looks good. Similarly untested/compiled.
Emitting a message on failure to initialize IMA at late_initcall is good, but
the attestation service won't know. Could you somehow differentiate between the
late_initcall and late_initcall_sync boot_aggregate records?
Mimi
* Re: [RFC PATCH v1 11/11] landlock: Add documentation for capability and namespace restrictions
From: Justin Suess @ 2026-04-23 16:01 UTC (permalink / raw)
To: Mickaël Salaün
Cc: Christian Brauner, Günther Noack, Paul Moore,
Serge E . Hallyn, Lennart Poettering, Mikhail Ivanov,
Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang, kernel-team,
linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260423.xai2Pe3theiw@digikod.net>
On Thu, Apr 23, 2026 at 03:51:32PM +0200, Mickaël Salaün wrote:
> On Thu, Mar 12, 2026 at 10:48:42AM -0400, Justin Suess wrote:
> > On Thu, Mar 12, 2026 at 11:04:44AM +0100, Mickaël Salaün wrote:
> > > Document the two new Landlock permission categories in the userspace
> > > API guide, admin guide, and kernel security documentation.
> > >
> > > The userspace API guide adds sections on capability restriction
> > > (LANDLOCK_PERM_CAPABILITY_USE with LANDLOCK_RULE_CAPABILITY), namespace
> > > restriction (LANDLOCK_PERM_NAMESPACE_ENTER with LANDLOCK_RULE_NAMESPACE
> > > covering creation via unshare/clone and entry via setns), and the
> > > backward-compatible degradation pattern for ABI < 9. A table documents
> > > the per-namespace-type capability requirements for both creation and
> > > entry.
> > >
> > > The admin guide adds the new perm.namespace_enter and
> > > perm.capability_use audit blocker names with their object identification
> > > fields (namespace_type, namespace_inum, capability).
> > >
> > > The kernel security documentation adds a "Ruleset restriction models"
> > > section defining the three models (handled_access_*, handled_perm,
> > > scoped), their coverage and compatibility properties, and the criteria
> > > for choosing between them for future features. It also documents
> > > composability with user namespaces and adds kernel-doc references for
> > > the new capability and namespace headers.
> > >
> > > Cc: Christian Brauner <brauner@kernel.org>
> > > Cc: Günther Noack <gnoack@google.com>
> > > Cc: Paul Moore <paul@paul-moore.com>
> > > Cc: Serge E. Hallyn <serge@hallyn.com>
> > > Signed-off-by: Mickaël Salaün <mic@digikod.net>
> > > ---
> > > Documentation/admin-guide/LSM/landlock.rst | 19 ++-
> > > Documentation/security/landlock.rst | 80 ++++++++++-
> > > Documentation/userspace-api/landlock.rst | 156 ++++++++++++++++++++-
> > > 3 files changed, 245 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/Documentation/admin-guide/LSM/landlock.rst b/Documentation/admin-guide/LSM/landlock.rst
> > > index 9923874e2156..99c6a599ce9e 100644
> > > --- a/Documentation/admin-guide/LSM/landlock.rst
> > > +++ b/Documentation/admin-guide/LSM/landlock.rst
> > > @@ -6,7 +6,7 @@ Landlock: system-wide management
> > > ================================
> > >
> > > :Author: Mickaël Salaün
> > > -:Date: January 2026
> > > +:Date: March 2026
> > >
> > > Landlock can leverage the audit framework to log events.
> > >
> > > @@ -59,14 +59,25 @@ AUDIT_LANDLOCK_ACCESS
> > > - scope.abstract_unix_socket - Abstract UNIX socket connection denied
> > > - scope.signal - Signal sending denied
> > >
> > > + **perm.*** - Permission restrictions (ABI 9+):
> > > + - perm.namespace_enter - Namespace entry was denied (creation via
> > > + :manpage:`unshare(2)` / :manpage:`clone(2)` or joining via
> > > + :manpage:`setns(2)`);
> > > + ``namespace_type`` indicates the type (hex CLONE_NEW* bitmask),
> > > + ``namespace_inum`` identifies the target namespace for
> > > + :manpage:`setns(2)` operations
> > > + - perm.capability_use - Capability use was denied;
> > > + ``capability`` indicates the capability number
> > > +
> > > Multiple blockers can appear in a single event (comma-separated) when
> > > multiple access rights are missing. For example, creating a regular file
> > > in a directory that lacks both ``make_reg`` and ``refer`` rights would show
> > > ``blockers=fs.make_reg,fs.refer``.
> > >
> > > - The object identification fields (path, dev, ino for filesystem; opid,
> > > - ocomm for signals) depend on the type of access being blocked and provide
> > > - context about what resource was involved in the denial.
> > > + The object identification fields depend on the type of access being blocked:
> > > + ``path``, ``dev``, ``ino`` for filesystem; ``opid``, ``ocomm`` for signals;
> > > + ``namespace_type`` and ``namespace_inum`` for namespace operations;
> > > + ``capability`` for capability use.
> > >
> > >
> > > AUDIT_LANDLOCK_DOMAIN
> > > diff --git a/Documentation/security/landlock.rst b/Documentation/security/landlock.rst
> > > index 3e4d4d04cfae..cd3d640ca5c9 100644
> > > --- a/Documentation/security/landlock.rst
> > > +++ b/Documentation/security/landlock.rst
> > > @@ -7,7 +7,7 @@ Landlock LSM: kernel documentation
> > > ==================================
> > >
> > > :Author: Mickaël Salaün
> > > -:Date: September 2025
> > > +:Date: March 2026
> > >
> > > Landlock's goal is to create scoped access-control (i.e. sandboxing). To
> > > harden a whole system, this feature should be available to any process,
> > > @@ -89,6 +89,72 @@ this is required to keep access controls consistent over the whole system, and
> > > this avoids unattended bypasses through file descriptor passing (i.e. confused
> > > deputy attack).
> > >
> > > +Composability with user namespaces
> > > +----------------------------------
> > > +
> > > +Landlock domain-based scoping and the kernel's user namespace-based capability
> > > +scoping enforce isolation over independent hierarchies. Landlock checks domain
> > > +ancestry; the kernel's ``ns_capable()`` checks user namespace ancestry. These
> > > +hierarchies are orthogonal: Landlock enforcement is deterministic with respect
> > > +to its own configuration, regardless of namespace or capability state, and vice
> > > +versa. This orthogonality is a design invariant that must hold for all new
> > > +scoped features.
> > The last sentence on orthogonality may better belong under the restriction
> > model section for scoped access rights. I assume that future scopes must
> > also be deterministic with respect to landlock's configuration as well,
> > not just user namespaces.
>
> Correct
>
> > > +
> > > +Ruleset restriction models
> > > +--------------------------
> > +1
> >
> > This section is very helpful for aligning new features with a particular
> > model.
>
> Thanks
>
> >
> > > +
> > > +Landlock provides three restriction models, each with different coverage
> > > +and compatibility properties.
> > Maybe add:
> >
> > Each restriction model below corresponds to one or more fields of
> > ``struct landlock_ruleset_attr``.
>
> Ok
>
> >
> > > +
> > > +Access rights (``handled_access_*``)
> > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > +
> > > +Access rights control **enumerated operations on kernel objects**
> > > +identified by a rule key (a file hierarchy or a network port). Each
> > > +``handled_access_*`` field declares a set of access rights that the
> > > +ruleset restricts. Multiple access rights share a single rule type.
> > > +Operations for which no access right exists yet remain uncontrolled;
> > > +new rights are added incrementally across ABI versions.
> > > +
> > > +Permissions (``handled_perm``)
> > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > +
> > > +Permissions control **broad operations enforced at single kernel
> > > +chokepoints**, achieving complete deny-by-default coverage. Each
> > > +``LANDLOCK_PERM_*`` flag maps to its own rule type. When a ruleset
> > > +handles a permission, all instances of that operation are denied unless
> > > +explicitly allowed by a rule. New kernel values (new ``CAP_*``
> > > +capabilities, new ``CLONE_NEW*`` namespace types) are automatically
> > > +denied without any Landlock update.
> > > +
> > > +Each permission flag names a single gateway operation whose control
> > > +transitively covers an open-ended set of downstream operations: for
> > > +example, exercising a capability enables privileged operations across
> > > +many subsystems; entering a namespace enables gaining capabilities in a
> > > +new context.
> > > +
> > > +Permission rules identify what to allow using constants defined by other
> > > +kernel subsystems (``CAP_*``, ``CLONE_NEW*``). Unknown values are
> > > +silently ignored because deny-by-default ensures they are denied anyway.
> > > +In contrast, unknown ``LANDLOCK_PERM_*`` flags in ``handled_perm`` are
> > > +rejected (``-EINVAL``), since Landlock owns that namespace.
> > > +
> > > +Scopes (``scoped``)
> > > +~~~~~~~~~~~~~~~~~~~~
> > > +
> > > +Scopes restrict **cross-domain interactions** categorically, without
> > > +rules. Setting a scope flag (e.g. ``LANDLOCK_SCOPE_SIGNAL``) denies the
> > > +operation to targets outside the Landlock domain or its children. Like
> > > +permissions, scopes provide complete coverage of the controlled
> > > +operation.
> > > +
> > > +When adding new Landlock features, new operations on existing rule types
> > > +extend the corresponding ``handled_access_*`` field (e.g. a new
> > > +filesystem operation extends ``handled_access_fs``). A new object
> > > +category with multiple fine-grained operations would use a new
> > > +``handled_access_*`` field. New rule types that control a single
> > > +chokepoint operation use ``handled_perm``.
> > > +
> > > Tests
> > > =====
> > >
> > > @@ -110,6 +176,18 @@ Filesystem
> > > .. kernel-doc:: security/landlock/fs.h
> > > :identifiers:
> > >
> > > +Namespace
> > > +---------
> > > +
> > > +.. kernel-doc:: security/landlock/ns.h
> > > + :identifiers:
> > > +
> > > +Capability
> > > +----------
> > > +
> > > +.. kernel-doc:: security/landlock/cap.h
> > > + :identifiers:
> > > +
> > > Process credential
> > > ------------------
> > >
> > > diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
> > > index 13134bccdd39..238d30a18162 100644
> > > --- a/Documentation/userspace-api/landlock.rst
> > > +++ b/Documentation/userspace-api/landlock.rst
> > > @@ -8,7 +8,7 @@ Landlock: unprivileged access control
> > > =====================================
> > >
> > > :Author: Mickaël Salaün
> > > -:Date: January 2026
> > > +:Date: March 2026
> > >
> > > The goal of Landlock is to enable restriction of ambient rights (e.g. global
> > > filesystem or network access) for a set of processes. Because Landlock
> > > @@ -33,7 +33,7 @@ A Landlock rule describes an action on an object which the process intends to
> > > perform. A set of rules is aggregated in a ruleset, which can then restrict
> > > the thread enforcing it, and its future children.
> > >
> > > -The two existing types of rules are:
> > > +The existing types of rules are:
> > >
> > > Filesystem rules
> > > For these rules, the object is a file hierarchy,
> > > @@ -44,6 +44,14 @@ Network rules (since ABI v4)
> > > For these rules, the object is a TCP port,
> > > and the related actions are defined with `network access rights`.
> > >
> > > +Capability rules (since ABI v9)
> > > + For these rules, the object is a set of Linux capabilities,
> > > + and the related actions are defined with `permission flags`.
> > > +
> > > +Namespace rules (since ABI v9)
> > > + For these rules, the object is a set of namespace types,
> > > + and the related actions are defined with `permission flags`.
> > > +
> > > Defining and enforcing a security policy
> > > ----------------------------------------
> > >
> > > @@ -84,6 +92,9 @@ to be explicit about the denied-by-default access rights.
> > > .scoped =
> > > LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
> > > LANDLOCK_SCOPE_SIGNAL,
> > > + .handled_perm =
> > > + LANDLOCK_PERM_CAPABILITY_USE |
> > > + LANDLOCK_PERM_NAMESPACE_ENTER,
> > > };
> > >
> > > Because we may not know which kernel version an application will be executed
> > > @@ -127,6 +138,12 @@ version, and only use the available subset of access rights:
> > > /* Removes LANDLOCK_SCOPE_* for ABI < 6 */
> > > ruleset_attr.scoped &= ~(LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
> > > LANDLOCK_SCOPE_SIGNAL);
> > > + __attribute__((fallthrough));
> > > + case 6:
> > > + case 7:
> > > + case 8:
> > > + /* Removes permission support for ABI < 9 */
> > > + ruleset_attr.handled_perm = 0;
> > > }
> > >
> > > This enables the creation of an inclusive ruleset that will contain our rules.
> > > @@ -191,6 +208,42 @@ number for a specific action: HTTPS connections.
> > > err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
> > > &net_port, 0);
> > >
> > > +For capability access-control, we can add rules that allow specific
> > > +capabilities. For instance, to allow ``CAP_SYS_CHROOT`` (so the sandboxed
> > > +process can call :manpage:`chroot(2)` inside a user namespace):
> > > +
> > > +.. code-block:: c
> > > +
> > > + struct landlock_capability_attr cap_attr = {
> > > + .allowed_perm = LANDLOCK_PERM_CAPABILITY_USE,
> > > + .capabilities = (1ULL << CAP_SYS_CHROOT),
> > > + };
> > > +
> > > + err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_CAPABILITY,
> > > + &cap_attr, 0);
> > > +
> > > +For namespace access-control, we can add rules that allow entering specific
> > > +namespace types (creating them via :manpage:`unshare(2)` / :manpage:`clone(2)`
> > > +or joining them via :manpage:`setns(2)`). For instance, to allow creating user
> > > +namespaces (which grants all capabilities inside the new namespace):
> > > +
> > > +.. code-block:: c
> > > +
> > > + struct landlock_namespace_attr ns_attr = {
> > > + .allowed_perm = LANDLOCK_PERM_NAMESPACE_ENTER,
> > > + .namespace_types = CLONE_NEWUSER,
> > > + };
> > > +
> > > + err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NAMESPACE,
> > > + &ns_attr, 0);
> > > +
> > > +Together, these two rules allow an unprivileged process to create a user
> > > +namespace and call :manpage:`chroot(2)` inside it, while denying all other
> > > +capabilities and namespace types. User namespace creation is the one operation
> > > +that does not require ``CAP_SYS_ADMIN``, so no capability rule is needed for it.
> > > +See `Capability and namespace restrictions`_ for details on capability
> > > +requirements.
> > > +
> > > When passing a non-zero ``flags`` argument to ``landlock_restrict_self()``, a
> > > similar backwards compatibility check is needed for the restrict flags
> > > (see sys_landlock_restrict_self() documentation for available flags):
> > > @@ -354,10 +407,87 @@ The operations which can be scoped are:
> > > A :manpage:`sendto(2)` on a socket which was previously connected will not
> > > be restricted. This works for both datagram and stream sockets.
> > >
> > > -IPC scoping does not support exceptions via :manpage:`landlock_add_rule(2)`.
> > > +Scoping does not support exceptions via :manpage:`landlock_add_rule(2)`.
> > > If an operation is scoped within a domain, no rules can be added to allow access
> > > to resources or processes outside of the scope.
> > >
> > > +Capability and namespace restrictions
> > > +-------------------------------------
> > > +
> > > +See Documentation/security/landlock.rst for the design rationale behind
> > > +the permission model (``handled_perm``) and how it differs from access
> > > +rights (``handled_access_*``) and scopes (``scoped``).
> > > +When a process creates a user namespace, the kernel grants all capabilities
> > > +within that namespace. While these capabilities cannot directly bypass Landlock
> > > +restrictions (Landlock enforces access controls independently of capability
> > > +checks), they open kernel code paths that are normally unreachable to
> > > +unprivileged users and may contain exploitable bugs.
> > > +
> > > +Landlock provides two complementary permissions to address this.
> > > +``LANDLOCK_PERM_CAPABILITY_USE`` restricts which capabilities a process can use,
> > > +even when it holds them. ``LANDLOCK_PERM_NAMESPACE_ENTER`` restricts which
> > > +namespace types a process can create (via :manpage:`unshare(2)` or
> > > +:manpage:`clone(2)`) or join (via :manpage:`setns(2)`). After creating a user
> > > +namespace, the granted capabilities are scoped to namespaces owned by that user
> > > +namespace or its descendants; to exercise a capability such as
> > > +``CAP_NET_ADMIN``, the process must create a namespace of the corresponding type
> > > +(e.g., a network namespace). Configuring both permissions together provides
> > > +full coverage: ``LANDLOCK_PERM_CAPABILITY_USE`` restricts which capabilities are
> > > +available, while ``LANDLOCK_PERM_NAMESPACE_ENTER`` restricts the namespaces in
> > > +which they can be used.
> > > Maybe add a section on what this does versus PR_SET_NO_NEW_PRIVS.
>
> Hmm, what do you mean? What would be the link with this part?
PR_SET_NO_NEW_PRIVS prevents gaining privileges through execution,
including file capabilities (i.e. those set with setcap, not just setuid/setgid).
So the two are at least adjacent.
Some users might not want to set NNP because they want to execute
a binary with file capabilities such as CAP_BPF set, for instance. But
they don't need CAP_SYS_ADMIN or other capabilities for their use case.
There could be language saying "*hint hint* hey, if you can't use NNP,
you should really be looking at the capability restrictions".
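To make the distinction concrete, here is a minimal sketch of the NNP side
only. It uses the existing prctl(2) interface; the helper name
enable_no_new_privs() is made up for illustration. NNP stops a thread from
gaining privileges across execve(), but it does not stop the thread from
exercising capabilities it already holds, which is the gap the proposed
LANDLOCK_PERM_CAPABILITY_USE is meant to cover. The Landlock side is left
out on purpose because the RFC interface may still change.

```c
#include <stdio.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper: sets no_new_privs for the calling thread and
 * reads the flag back. Returns 1 if NNP is set, -1 on error.
 * Note that NNP cannot be cleared once it has been set.
 */
static int enable_no_new_privs(void)
{
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0) {
		perror("prctl(PR_SET_NO_NEW_PRIVS)");
		return -1;
	}

	/* PR_GET_NO_NEW_PRIVS returns the current value of the flag. */
	return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

After this call, executing a binary that carries file capabilities no longer
grants them, so a user who depends on such a binary cannot use NNP and
would have to rely on the capability restrictions instead.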
>
> >
> > The difference might be obvious to people familiar with namespaces and
> > capabilities, but not to many users less familiar with the subject.
> >
> > I could see users of the LANDLOCK_PERM_* flags erroneously
> > assuming that LANDLOCK_PERM_CAPABILITY_USE is required to restrict gaining
> > new capabilities through execve() (i.e. through setuid), when in fact this is
> > already restricted if NNP is set.
>
> What would be the issue if no rule allows capabilities? The more
> handled_* or scoped bits are set, the better.
Agreed, the more the better.
I just think it would be beneficial to mention the differences up front,
especially because NNP won't prevent exercise of existing capabilities,
but this will. So the description for this should at least touch on NNP
because they are complementary.
I think a lot of devs that just want to add sandboxing aren't deeply
familiar with how capabilities work.
>
> >
> > Some clarification on this would be helpful here or where
> > PR_SET_NO_NEW_PRIVS is discussed in the Landlock docs.
>
> Ok, I'll try to add something about NNP.
>
> > > +
> > > +When a Landlock domain handles ``LANDLOCK_PERM_CAPABILITY_USE``, all Linux
> > > +:manpage:`capabilities(7)` are denied by default unless a rule explicitly allows
> > Nit:
> >
> > all Linux :manpage:`capabilities(7)`
> >
> > might be better as
> >
> > the exercise of all Linux :manpage:`capabilities(7)`
>
> Indeed
>
> > [...]
* Re: [RFC PATCH v1 11/11] landlock: Add documentation for capability and namespace restrictions
From: Justin Suess @ 2026-04-23 16:08 UTC (permalink / raw)
To: Mickaël Salaün
Cc: Christian Brauner, Günther Noack, Paul Moore,
Serge E . Hallyn, Lennart Poettering, Mikhail Ivanov,
Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang, kernel-team,
linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <aeo7m6LCE0Pi_O-V@suesslenovo>
On Thu, Apr 23, 2026 at 12:01:08PM -0400, Justin Suess wrote:
> On Thu, Apr 23, 2026 at 03:51:32PM +0200, Mickaël Salaün wrote:
> > On Thu, Mar 12, 2026 at 10:48:42AM -0400, Justin Suess wrote:
> > > On Thu, Mar 12, 2026 at 11:04:44AM +0100, Mickaël Salaün wrote:
> > > > Document the two new Landlock permission categories in the userspace
> > > > API guide, admin guide, and kernel security documentation.
> > > >
> > > > The userspace API guide adds sections on capability restriction
> > > > (LANDLOCK_PERM_CAPABILITY_USE with LANDLOCK_RULE_CAPABILITY), namespace
> > > > restriction (LANDLOCK_PERM_NAMESPACE_ENTER with LANDLOCK_RULE_NAMESPACE
> > > > covering creation via unshare/clone and entry via setns), and the
> > > > backward-compatible degradation pattern for ABI < 9. A table documents
> > > > the per-namespace-type capability requirements for both creation and
> > > > entry.
> > > >
> > > > The admin guide adds the new perm.namespace_enter and
> > > > perm.capability_use audit blocker names with their object identification
> > > > fields (namespace_type, namespace_inum, capability).
> > > >
> > > > The kernel security documentation adds a "Ruleset restriction models"
> > > > section defining the three models (handled_access_*, handled_perm,
> > > > scoped), their coverage and compatibility properties, and the criteria
> > > > for choosing between them for future features. It also documents
> > > > composability with user namespaces and adds kernel-doc references for
> > > > the new capability and namespace headers.
> > > >
> > > > Cc: Christian Brauner <brauner@kernel.org>
> > > > Cc: Günther Noack <gnoack@google.com>
> > > > Cc: Paul Moore <paul@paul-moore.com>
> > > > Cc: Serge E. Hallyn <serge@hallyn.com>
> > > > Signed-off-by: Mickaël Salaün <mic@digikod.net>
> > > > ---
> > > > Documentation/admin-guide/LSM/landlock.rst | 19 ++-
> > > > Documentation/security/landlock.rst | 80 ++++++++++-
> > > > Documentation/userspace-api/landlock.rst | 156 ++++++++++++++++++++-
> > > > 3 files changed, 245 insertions(+), 10 deletions(-)
> > > >
> > > > diff --git a/Documentation/admin-guide/LSM/landlock.rst b/Documentation/admin-guide/LSM/landlock.rst
> > > > index 9923874e2156..99c6a599ce9e 100644
> > > > --- a/Documentation/admin-guide/LSM/landlock.rst
> > > > +++ b/Documentation/admin-guide/LSM/landlock.rst
> > > > @@ -6,7 +6,7 @@ Landlock: system-wide management
> > > > ================================
> > > >
> > > > :Author: Mickaël Salaün
> > > > -:Date: January 2026
> > > > +:Date: March 2026
> > > >
> > > > Landlock can leverage the audit framework to log events.
> > > >
> > > > @@ -59,14 +59,25 @@ AUDIT_LANDLOCK_ACCESS
> > > > - scope.abstract_unix_socket - Abstract UNIX socket connection denied
> > > > - scope.signal - Signal sending denied
> > > >
> > > > + **perm.*** - Permission restrictions (ABI 9+):
> > > > + - perm.namespace_enter - Namespace entry was denied (creation via
> > > > + :manpage:`unshare(2)` / :manpage:`clone(2)` or joining via
> > > > + :manpage:`setns(2)`);
> > > > + ``namespace_type`` indicates the type (hex CLONE_NEW* bitmask),
> > > > + ``namespace_inum`` identifies the target namespace for
> > > > + :manpage:`setns(2)` operations
> > > > + - perm.capability_use - Capability use was denied;
> > > > + ``capability`` indicates the capability number
> > > > +
> > > > Multiple blockers can appear in a single event (comma-separated) when
> > > > multiple access rights are missing. For example, creating a regular file
> > > > in a directory that lacks both ``make_reg`` and ``refer`` rights would show
> > > > ``blockers=fs.make_reg,fs.refer``.
> > > >
> > > > - The object identification fields (path, dev, ino for filesystem; opid,
> > > > - ocomm for signals) depend on the type of access being blocked and provide
> > > > - context about what resource was involved in the denial.
> > > > + The object identification fields depend on the type of access being blocked:
> > > > + ``path``, ``dev``, ``ino`` for filesystem; ``opid``, ``ocomm`` for signals;
> > > > + ``namespace_type`` and ``namespace_inum`` for namespace operations;
> > > > + ``capability`` for capability use.
> > > >
> > > >
> > > > AUDIT_LANDLOCK_DOMAIN
> > > > diff --git a/Documentation/security/landlock.rst b/Documentation/security/landlock.rst
> > > > index 3e4d4d04cfae..cd3d640ca5c9 100644
> > > > --- a/Documentation/security/landlock.rst
> > > > +++ b/Documentation/security/landlock.rst
> > > > @@ -7,7 +7,7 @@ Landlock LSM: kernel documentation
> > > > ==================================
> > > >
> > > > :Author: Mickaël Salaün
> > > > -:Date: September 2025
> > > > +:Date: March 2026
> > > >
> > > > Landlock's goal is to create scoped access-control (i.e. sandboxing). To
> > > > harden a whole system, this feature should be available to any process,
> > > > @@ -89,6 +89,72 @@ this is required to keep access controls consistent over the whole system, and
> > > > this avoids unattended bypasses through file descriptor passing (i.e. confused
> > > > deputy attack).
> > > >
> > > > +Composability with user namespaces
> > > > +----------------------------------
> > > > +
> > > > +Landlock domain-based scoping and the kernel's user namespace-based capability
> > > > +scoping enforce isolation over independent hierarchies. Landlock checks domain
> > > > +ancestry; the kernel's ``ns_capable()`` checks user namespace ancestry. These
> > > > +hierarchies are orthogonal: Landlock enforcement is deterministic with respect
> > > > +to its own configuration, regardless of namespace or capability state, and vice
> > > > +versa. This orthogonality is a design invariant that must hold for all new
> > > > +scoped features.
> > > > The last sentence on orthogonality may belong better under the restriction
> > > > model section for scoped access rights. I assume that future scopes must
> > > > be deterministic with respect to Landlock's configuration in general,
> > > > not just with respect to user namespaces.
> >
> > Correct
> >
> > > > +
> > > > +Ruleset restriction models
> > > > +--------------------------
> > > +1
> > >
> > > This section is very helpful for aligning new features with a particular
> > > model.
> >
> > Thanks
> >
> > >
> > > > +
> > > > +Landlock provides three restriction models, each with different coverage
> > > > +and compatibility properties.
> > > Maybe add:
> > >
> > > Each restriction model below corresponds to one or more fields of
> > > ``struct landlock_ruleset_attr``.
> >
> > Ok
> >
> > >
> > > > +
> > > > +Access rights (``handled_access_*``)
> > > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > > +
> > > > +Access rights control **enumerated operations on kernel objects**
> > > > +identified by a rule key (a file hierarchy or a network port). Each
> > > > +``handled_access_*`` field declares a set of access rights that the
> > > > +ruleset restricts. Multiple access rights share a single rule type.
> > > > +Operations for which no access right exists yet remain uncontrolled;
> > > > +new rights are added incrementally across ABI versions.
> > > > +
> > > > +Permissions (``handled_perm``)
> > > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > > +
> > > > +Permissions control **broad operations enforced at single kernel
> > > > +chokepoints**, achieving complete deny-by-default coverage. Each
> > > > +``LANDLOCK_PERM_*`` flag maps to its own rule type. When a ruleset
> > > > +handles a permission, all instances of that operation are denied unless
> > > > +explicitly allowed by a rule. New kernel values (new ``CAP_*``
> > > > +capabilities, new ``CLONE_NEW*`` namespace types) are automatically
> > > > +denied without any Landlock update.
> > > > +
> > > > +Each permission flag names a single gateway operation whose control
> > > > +transitively covers an open-ended set of downstream operations: for
> > > > +example, exercising a capability enables privileged operations across
> > > > +many subsystems; entering a namespace enables gaining capabilities in a
> > > > +new context.
> > > > +
> > > > +Permission rules identify what to allow using constants defined by other
> > > > +kernel subsystems (``CAP_*``, ``CLONE_NEW*``). Unknown values are
> > > > +silently ignored because deny-by-default ensures they are denied anyway.
> > > > +In contrast, unknown ``LANDLOCK_PERM_*`` flags in ``handled_perm`` are
> > > > +rejected (``-EINVAL``), since Landlock owns that namespace.
> > > > +
> > > > +Scopes (``scoped``)
> > > > +~~~~~~~~~~~~~~~~~~~~
> > > > +
> > > > +Scopes restrict **cross-domain interactions** categorically, without
> > > > +rules. Setting a scope flag (e.g. ``LANDLOCK_SCOPE_SIGNAL``) denies the
> > > > +operation to targets outside the Landlock domain or its children. Like
> > > > +permissions, scopes provide complete coverage of the controlled
> > > > +operation.
> > > > +
> > > > +When adding new Landlock features, new operations on existing rule types
> > > > +extend the corresponding ``handled_access_*`` field (e.g. a new
> > > > +filesystem operation extends ``handled_access_fs``). A new object
> > > > +category with multiple fine-grained operations would use a new
> > > > +``handled_access_*`` field. New rule types that control a single
> > > > +chokepoint operation use ``handled_perm``.
> > > > +
> > > > Tests
> > > > =====
> > > >
> > > > @@ -110,6 +176,18 @@ Filesystem
> > > > .. kernel-doc:: security/landlock/fs.h
> > > > :identifiers:
> > > >
> > > > +Namespace
> > > > +---------
> > > > +
> > > > +.. kernel-doc:: security/landlock/ns.h
> > > > + :identifiers:
> > > > +
> > > > +Capability
> > > > +----------
> > > > +
> > > > +.. kernel-doc:: security/landlock/cap.h
> > > > + :identifiers:
> > > > +
> > > > Process credential
> > > > ------------------
> > > >
> > > > diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
> > > > index 13134bccdd39..238d30a18162 100644
> > > > --- a/Documentation/userspace-api/landlock.rst
> > > > +++ b/Documentation/userspace-api/landlock.rst
> > > > @@ -8,7 +8,7 @@ Landlock: unprivileged access control
> > > > =====================================
> > > >
> > > > :Author: Mickaël Salaün
> > > > -:Date: January 2026
> > > > +:Date: March 2026
> > > >
> > > > The goal of Landlock is to enable restriction of ambient rights (e.g. global
> > > > filesystem or network access) for a set of processes. Because Landlock
> > > > @@ -33,7 +33,7 @@ A Landlock rule describes an action on an object which the process intends to
> > > > perform. A set of rules is aggregated in a ruleset, which can then restrict
> > > > the thread enforcing it, and its future children.
> > > >
> > > > -The two existing types of rules are:
> > > > +The existing types of rules are:
> > > >
> > > > Filesystem rules
> > > > For these rules, the object is a file hierarchy,
> > > > @@ -44,6 +44,14 @@ Network rules (since ABI v4)
> > > > For these rules, the object is a TCP port,
> > > > and the related actions are defined with `network access rights`.
> > > >
> > > > +Capability rules (since ABI v9)
> > > > + For these rules, the object is a set of Linux capabilities,
> > > > + and the related actions are defined with `permission flags`.
> > > > +
> > > > +Namespace rules (since ABI v9)
> > > > + For these rules, the object is a set of namespace types,
> > > > + and the related actions are defined with `permission flags`.
> > > > +
> > > > Defining and enforcing a security policy
> > > > ----------------------------------------
> > > >
> > > > @@ -84,6 +92,9 @@ to be explicit about the denied-by-default access rights.
> > > > .scoped =
> > > > LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
> > > > LANDLOCK_SCOPE_SIGNAL,
> > > > + .handled_perm =
> > > > + LANDLOCK_PERM_CAPABILITY_USE |
> > > > + LANDLOCK_PERM_NAMESPACE_ENTER,
> > > > };
> > > >
> > > > Because we may not know which kernel version an application will be executed
> > > > @@ -127,6 +138,12 @@ version, and only use the available subset of access rights:
> > > > /* Removes LANDLOCK_SCOPE_* for ABI < 6 */
> > > > ruleset_attr.scoped &= ~(LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
> > > > LANDLOCK_SCOPE_SIGNAL);
> > > > + __attribute__((fallthrough));
> > > > + case 6:
> > > > + case 7:
> > > > + case 8:
> > > > + /* Removes permission support for ABI < 9 */
> > > > + ruleset_attr.handled_perm = 0;
> > > > }
> > > >
> > > > This enables the creation of an inclusive ruleset that will contain our rules.
> > > > @@ -191,6 +208,42 @@ number for a specific action: HTTPS connections.
> > > > err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
> > > > &net_port, 0);
> > > >
> > > > +For capability access-control, we can add rules that allow specific
> > > > +capabilities. For instance, to allow ``CAP_SYS_CHROOT`` (so the sandboxed
> > > > +process can call :manpage:`chroot(2)` inside a user namespace):
> > > > +
> > > > +.. code-block:: c
> > > > +
> > > > + struct landlock_capability_attr cap_attr = {
> > > > + .allowed_perm = LANDLOCK_PERM_CAPABILITY_USE,
> > > > + .capabilities = (1ULL << CAP_SYS_CHROOT),
> > > > + };
> > > > +
> > > > + err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_CAPABILITY,
> > > > + &cap_attr, 0);
> > > > +
> > > > +For namespace access-control, we can add rules that allow entering specific
> > > > +namespace types (creating them via :manpage:`unshare(2)` / :manpage:`clone(2)`
> > > > +or joining them via :manpage:`setns(2)`). For instance, to allow creating user
> > > > +namespaces (which grants all capabilities inside the new namespace):
> > > > +
> > > > +.. code-block:: c
> > > > +
> > > > + struct landlock_namespace_attr ns_attr = {
> > > > + .allowed_perm = LANDLOCK_PERM_NAMESPACE_ENTER,
> > > > + .namespace_types = CLONE_NEWUSER,
> > > > + };
> > > > +
> > > > + err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NAMESPACE,
> > > > + &ns_attr, 0);
> > > > +
> > > > +Together, these two rules allow an unprivileged process to create a user
> > > > +namespace and call :manpage:`chroot(2)` inside it, while denying all other
> > > > +capabilities and namespace types. User namespace creation is the one operation
> > > > +that does not require ``CAP_SYS_ADMIN``, so no capability rule is needed for it.
> > > > +See `Capability and namespace restrictions`_ for details on capability
> > > > +requirements.
> > > > +
> > > > When passing a non-zero ``flags`` argument to ``landlock_restrict_self()``, a
> > > > similar backwards compatibility check is needed for the restrict flags
> > > > (see sys_landlock_restrict_self() documentation for available flags):
> > > > @@ -354,10 +407,87 @@ The operations which can be scoped are:
> > > > A :manpage:`sendto(2)` on a socket which was previously connected will not
> > > > be restricted. This works for both datagram and stream sockets.
> > > >
> > > > -IPC scoping does not support exceptions via :manpage:`landlock_add_rule(2)`.
> > > > +Scoping does not support exceptions via :manpage:`landlock_add_rule(2)`.
> > > > If an operation is scoped within a domain, no rules can be added to allow access
> > > > to resources or processes outside of the scope.
> > > >
> > > > +Capability and namespace restrictions
> > > > +-------------------------------------
> > > > +
> > > > +See Documentation/security/landlock.rst for the design rationale behind
> > > > +the permission model (``handled_perm``) and how it differs from access
> > > > +rights (``handled_access_*``) and scopes (``scoped``).
> > > > +When a process creates a user namespace, the kernel grants all capabilities
> > > > +within that namespace. While these capabilities cannot directly bypass Landlock
> > > > +restrictions (Landlock enforces access controls independently of capability
> > > > +checks), they open kernel code paths that are normally unreachable to
> > > > +unprivileged users and may contain exploitable bugs.
> > > > +
> > > > +Landlock provides two complementary permissions to address this.
> > > > +``LANDLOCK_PERM_CAPABILITY_USE`` restricts which capabilities a process can use,
> > > > +even when it holds them. ``LANDLOCK_PERM_NAMESPACE_ENTER`` restricts which
> > > > +namespace types a process can create (via :manpage:`unshare(2)` or
> > > > +:manpage:`clone(2)`) or join (via :manpage:`setns(2)`). After creating a user
> > > > +namespace, the granted capabilities are scoped to namespaces owned by that user
> > > > +namespace or its descendants; to exercise a capability such as
> > > > +``CAP_NET_ADMIN``, the process must create a namespace of the corresponding type
> > > > +(e.g., a network namespace). Configuring both permissions together provides
> > > > +full coverage: ``LANDLOCK_PERM_CAPABILITY_USE`` restricts which capabilities are
> > > > +available, while ``LANDLOCK_PERM_NAMESPACE_ENTER`` restricts the namespaces in
> > > > +which they can be used.
> > > > Maybe add a section on what this does versus PR_SET_NO_NEW_PRIVS.
> >
> > Hmm, what do you mean? What would be the link with this part?
> PR_SET_NO_NEW_PRIVS prevents gaining privileges through execution,
> including file capabilities (i.e. those set with setcap, not just setuid/setgid).
> So the two are at least adjacent.
>
> Some users might not want to set NNP because they want to execute
> a binary with file capabilities such as CAP_BPF set, for instance. But
> they don't need CAP_SYS_ADMIN or other capabilities for their use case.
>
Bad example, sorry: without NNP they need CAP_SYS_ADMIN to enforce the
ruleset (landlock_restrict_self() requires one or the other). But this
point still applies to other caps, or if they drop CAP_SYS_ADMIN after
applying the ruleset.
* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Jonathan McDowell @ 2026-04-23 17:02 UTC (permalink / raw)
To: Mimi Zohar
Cc: Yeoreum Yun, linux-security-module, linux-kernel, linux-integrity,
linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
will, noodles, sebastianene
In-Reply-To: <2866f7679fe6933de667ce74ae68bd4ea9198e2a.camel@linux.ibm.com>
On Thu, Apr 23, 2026 at 10:48:49AM -0400, Mimi Zohar wrote:
>On Thu, 2026-04-23 at 15:03 +0100, Jonathan McDowell wrote:
>> On Thu, Apr 23, 2026 at 02:55:14PM +0100, Yeoreum Yun wrote:
>> > > On Thu, 2026-04-23 at 13:53 +0100, Jonathan McDowell wrote:
>> > > > On Thu, Apr 23, 2026 at 01:34:13PM +0100, Yeoreum Yun wrote:
>> > > > > > > On Thu, 2026-04-23 at 06:55 +0100, Yeoreum Yun wrote:
>> > > > > > > > > On Wed, 2026-04-22 at 20:41 +0100, Yeoreum Yun wrote:
>> > > > > > > > > > > Hi Mimi,
>> > > > > > > > > > >
>> > > > > > > > > > > > On Wed, 2026-04-22 at 17:24 +0100, Yeoreum Yun wrote:
>> > > > > > > > > > > > > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
>> > > > > > > > > > > > > the TPM driver must be built as built-in and
>> > > > > > > > > > > > > must be probed before the IMA subsystem is initialized.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > However, when the TPM device operates over the FF-A protocol using
>> > > > > > > > > > > > > the CRB interface, probing fails and returns -EPROBE_DEFER if
>> > > > > > > > > > > > > the tpm_crb_ffa device — an FF-A device that provides the communication
>> > > > > > > > > > > > > interface to the tpm_crb driver — has not yet been probed.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > To ensure the TPM device operating over the FF-A protocol with
>> > > > > > > > > > > > > the CRB interface is probed before IMA initialization,
>> > > > > > > > > > > > > the following conditions must be met:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > 1. The corresponding ffa_device must be registered,
>> > > > > > > > > > > > > which is done via ffa_init().
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > 2. The tpm_crb_driver must successfully probe this device via
>> > > > > > > > > > > > > tpm_crb_ffa_init().
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > 3. The tpm_crb driver using CRB over FF-A can then
>> > > > > > > > > > > > > be probed successfully. (See crb_acpi_add() and
>> > > > > > > > > > > > > tpm_crb_ffa_init() for reference.)
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
>> > > > > > > > > > > > > all registered with device_initcall, which means crb_acpi_driver_init() may
>> > > > > > > > > > > > > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > When this occurs, probing the TPM device is deferred.
>> > > > > > > > > > > > > However, the deferred probe can happen after the IMA subsystem
>> > > > > > > > > > > > > has already been initialized, since IMA initialization is performed
>> > > > > > > > > > > > > during late_initcall, and deferred_probe_initcall() is performed
>> > > > > > > > > > > > > at the same level.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > To resolve this, call ima_init() again at late_initcall_sync level
>> > > > > > > > > > > > > so that IMA does not miss the TPM PCR values when generating the
>> > > > > > > > > > > > > boot_aggregate log even though a TPM device is present in the system.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
>> > > > > > > > > > > >
>> > > > > > > > > > > > A lot of change for just detecting whether ima_init() is being called on
>> > > > > > > > > > > > late_initcall or late_initcall_sync(), without any explanation for all the other
>> > > > > > > > > > > > changes (e.g. ima_init_core).
>> > > > > > > > > > > >
>> > > > > > > > > > > > Please just limit the change to just calling ima_init() twice.
>> > > > > > > > > > >
>> > > > > > > > > > > My concern is that ima_update_policy_flags() will be called
>> > > > > > > > > > > when ima_init() is deferred, i.e. when nothing has been initialised.
>> > > > > > > > > > > Functionally this might be okay, but logically
>> > > > > > > > > > > I think ima_update_policy_flags() and the notifier should only run
>> > > > > > > > > > > after ima_init() has succeeded.
>> > > > > > > > > > >
>> > > > > > > > > > > I don't think this change is that large: it just wraps ima_init() in
>> > > > > > > > > > > ima_init_core() with some error handling.
>> > > > > > > > > > >
>> > > > > > > > > > > Am I missing something?
>> > > > > > > > > >
>> > > > > > > > > > Also, if this is handled only inside ima_init() and it fails for some other
>> > > > > > > > > > reason, we shouldn't call ima_init() again at late_initcall_sync.
>> > > > > > > > > >
>> > > > > > > > > > That cannot be decided inside ima_init(); it needs to be handled
>> > > > > > > > > > by the caller of ima_init().
>> > > > > > > > >
>> > > > > > > > > Only tpm_default_chip() is being called to set the ima_tpm_chip. On failure,
>> > > > > > > > > instead of going into TPM-bypass mode, return immediately. There are no calls
>> > > > > > > > > to anything else. Just call ima_init() a second time.
>> > > > > > > >
>> > > > > > > > I’m not fully convinced this is sufficient.
>> > > > > > > >
>> > > > > > > > What I meant is the case where ima_init() fails due to other
>> > > > > > > > initialisation steps, not only tpm_default_chip() (e.g. ima_fs_init()).
>> > > > > > >
>> > > > > > > The purpose of THIS patch is to add late_initcall_sync, when the TPM is not
>> > > > > > > available at late_initcall. This would be classified as a bug fix and would be
>> > > > > > > backported. No other changes should be included in this patch.
>> > > > > >
>> > > > > > Okay.
>> > > > > >
>> > > > > > > >
>> > > > > > > > I’d also like to ask again whether it is fine to call
>> > > > > > > > ima_update_policy_flags() and keep the notifier registered in the
>> > > > > > > > deferred TPM case. While this may be functionally acceptable, it seems
>> > > > > > > > logically questionable to do so when ima_init() has not completed.
>> > > > > > >
>> > > > > > > Other than extending the TPM, IMA should behave exactly the same whether there
>> > > > > > > is a TPM or goes into TPM-bypass mode.
>> > > > > > >
>> > > > > > > >
>> > > > > > > > There is also a possibility that a deferred case ultimately fails (e.g.
>> > > > > > > > deferred at late_initcall, but then failing at late_initcall_sync
>> > > > > > > > for another reason, even while entering TPM bypass mode). In that case,
>> > > > > > > > it seems more appropriate to handle this state in the caller of
>> > > > > > > > ima_init(), rather than inside ima_init() itself.
>> > > > > > >
>> > > > > > > If the TPM isn't found at late_initcall_sync(), then IMA should go into TPM-
>> > > > > > > bypass mode. Please don't make any other changes to the existing IMA behavior
>> > > > > > > and hide it here behind the late_initcall_sync change.
>> > > > > >
>> > > > > > Okay. You're saying that calling ima_update_policy_flags() at late_initcall
>> > > > > > would not be a problem even if ima_init() at late_initcall_sync
>> > > > > > ends up in "TPM-bypass mode".
>> > > > > >
>> > > > > > I see. I'll make the patch simpler then.
>> > > > >
>> > > > > But consider the following situation:
>> > > > > - The first ima_init() at late_initcall is deferred.
>> > > > > - late_initcall_sync tries again but fails, then tries again with
>> > > > > CONFIG_IMA_DEFAULT_HASH.
>> > > > >
>> > > > > I would like to keep ima_init_core to avoid repeating the same code
>> > > > > in late_initcall_sync.
>> > > >
>> > > > I think what Mimi's proposing is:
>> > > >
>> > > > If we're in late_initcall, and the TPM isn't available, return
>> > > > immediately with an error (the EPROBE_DEFER?), don't do any init.
>> > > >
>> > > > If we're in late_initcall_sync, either we're already initialised, so
>> > > > return and do nothing, or run through the entire flow, even if the TPM
>> > > > is unavailable.
>> > > >
>> > > > So ima_init() just needs to know a) if it's in the sync or non-sync mode
>> > > > and b) for the sync mode, if we've already done the init at
>> > > > non-sync.
>> > >
>> > > Thanks, Jonathan. That is exactly what I'm suggesting. Any other changes
>> > > should not be included in this patch. Since Yeoreum is not hearing me, feel
>> > > free to post a patch.
>> >
>> > I see. So this is the only change you need.
>> > If it looks good to you, I'll send it as v3.
>>
>> FWIW, I pulled the tpm_default_chip check out a level to account for the
>> extra init you mentioned, and have the following (completely untested or
>> compiled, but gives the approach):
>
>Thanks, Jonathan! It looks good. Similarly untested/compiled.
FWIW, it does compile.
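For anyone skimming the thread, the two-stage flow discussed above can be
modelled in plain userspace C. This is only a sketch of the control flow,
not the patch: model_ima_init(), its two flags and MODEL_EPROBE_DEFER are
made-up names, the tpm_available argument stands in for whether
tpm_default_chip() finds a chip, and returning EPROBE_DEFER from the
non-sync stage is an assumption (it was only floated as a possibility
above).

```c
#include <stdbool.h>

/*
 * The kernel-internal EPROBE_DEFER value is not exported to userspace
 * headers, so the model defines its own constant.
 */
#define MODEL_EPROBE_DEFER 517

static bool model_ima_initialised; /* set once the full init has run */
static bool model_tpm_bypass;      /* true if init ran without a TPM */

/*
 * sync == false models the late_initcall invocation,
 * sync == true models the late_initcall_sync invocation.
 */
static int model_ima_init(bool sync, bool tpm_available)
{
	if (!sync) {
		/* Non-sync stage: no TPM yet, so defer and do no init. */
		if (!tpm_available)
			return -MODEL_EPROBE_DEFER;
	} else if (model_ima_initialised) {
		/* Sync stage: the non-sync stage already did the work. */
		return 0;
	}

	/* Full flow; a missing TPM at this point means TPM-bypass mode. */
	model_tpm_bypass = !tpm_available;
	model_ima_initialised = true;
	return 0;
}
```

In this model the only state carried between the two stages is the
"already initialised" flag, which matches points a) and b) above.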
>Emitting a message on failure to initialize IMA at late_initcall is good, but
>the attestation service won't know. Could you somehow differentiate between the
>late_initcall and late_initcall_sync boot_aggregate records?
Are you thinking "boot_aggregate" and "boot_aggregate_late" or similar
as the "filename" on the entries, just so it's clear when we did the
init in the log, or something else?
J.
--
/-\ | 101 things you can't have too much
|@/ Debian GNU/Linux Developer | of : 39 - silver bullets.
\- |
^ permalink raw reply
* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Mimi Zohar @ 2026-04-23 17:13 UTC (permalink / raw)
To: Jonathan McDowell
Cc: Yeoreum Yun, linux-security-module, linux-kernel, linux-integrity,
linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
will, noodles, sebastianene
In-Reply-To: <aepQwcY523aukAvw@earth.li>
On Thu, 2026-04-23 at 18:02 +0100, Jonathan McDowell wrote:
> > > > > >
> > > > > > I think what Mimi's proposing is:
> > > > > >
> > > > > > If we're in late_initcall, and the TPM isn't available, return
> > > > > > immediately with an error (the EPROBE_DEFER?), don't do any init.
> > > > > >
> > > > > > If we're in late_initcall_sync, either we're already initialised, so do
> > > > > > return and nothing, or run through the entire flow, even if the TPM
> > > > > > isn't unavailable.
> > > > > >
> > > > > > So ima_init() just needs to know a) if it's in the sync or non-sync mode
> > > > > > and b) for the sync mode, if we've already done the init at
> > > > > > non-sync.
> > > > >
> > > > > Thanks, Jonathan. That is exactly what I'm suggesting. Any other changes
> > > > > should not be included in this patch. Since Yeoreum is not hearing me, feel
> > > > > free to post a patch.
> > > >
> > > > I see. so what you need to is this only
> > > > If it looks good to you. I'll send it at v3.
> > >
> > > FWIW, I pulled the tpm_default_chip check out a level to account for the
> > > extra init you mentioned, and have the following (completely untested or
> > > compiled, but gives the approach):
> >
> > Thanks, Jonathan! It looks good. Similarly untested/compiled.
>
> FWIW, it does compile.
>
> > Emitting a message on failure to initialize IMA at late_initcall is good, but
> > the attestation service won't know. Could you somehow differentiate between the
> > late_initcall and late_initcall_sync boot_aggregate records?
>
> Are you thinking "boot_aggregate" and "boot_aggregate_late" or similar
> as the "filename" on the entries, just so it's clear when we did the
> init in the log, or something else?

Perfect!

Mimi
^ permalink raw reply
* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Mimi Zohar @ 2026-04-23 18:01 UTC (permalink / raw)
To: Yeoreum Yun, Jonathan McDowell
Cc: linux-security-module, linux-kernel, linux-integrity,
linux-arm-kernel, kvmarm, paul, jmorris, serge, roberto.sassu,
dmitry.kasatkin, eric.snowberg, jarkko, jgg, sudeep.holla, maz,
oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas,
will, noodles, sebastianene
In-Reply-To: <aeotq8nPVu4wvEx5@e129823.arm.com>
On Thu, 2026-04-23 at 15:33 +0100, Yeoreum Yun wrote:
> Hi Jonathan,
>
> > On Thu, Apr 23, 2026 at 02:55:14PM +0100, Yeoreum Yun wrote:
> > > > On Thu, 2026-04-23 at 13:53 +0100, Jonathan McDowell wrote:
> > > > > On Thu, Apr 23, 2026 at 01:34:13PM +0100, Yeoreum Yun wrote:
> > > > > > > > On Thu, 2026-04-23 at 06:55 +0100, Yeoreum Yun wrote:
> > > > > > > > > > On Wed, 2026-04-22 at 20:41 +0100, Yeoreum Yun wrote:
> > > > > > > > > > > > Hi Mimi,
> > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, 2026-04-22 at 17:24 +0100, Yeoreum Yun wrote:
> > > > > > > > > > > > > > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
> > > > > > > > > > > > > > the TPM driver must be built as built-in and
> > > > > > > > > > > > > > must be probed before the IMA subsystem is initialized.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > However, when the TPM device operates over the FF-A protocol using
> > > > > > > > > > > > > > the CRB interface, probing fails and returns -EPROBE_DEFER if
> > > > > > > > > > > > > > the tpm_crb_ffa device — an FF-A device that provides the communication
> > > > > > > > > > > > > > interface to the tpm_crb driver — has not yet been probed.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > To ensure the TPM device operating over the FF-A protocol with
> > > > > > > > > > > > > > the CRB interface is probed before IMA initialization,
> > > > > > > > > > > > > > the following conditions must be met:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1. The corresponding ffa_device must be registered,
> > > > > > > > > > > > > > which is done via ffa_init().
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2. The tpm_crb_driver must successfully probe this device via
> > > > > > > > > > > > > > tpm_crb_ffa_init().
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 3. The tpm_crb driver using CRB over FF-A can then
> > > > > > > > > > > > > > be probed successfully. (See crb_acpi_add() and
> > > > > > > > > > > > > > tpm_crb_ffa_init() for reference.)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
> > > > > > > > > > > > > > all registered with device_initcall, which means crb_acpi_driver_init() may
> > > > > > > > > > > > > > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > When this occurs, probing the TPM device is deferred.
> > > > > > > > > > > > > > However, the deferred probe can happen after the IMA subsystem
> > > > > > > > > > > > > > has already been initialized, since IMA initialization is performed
> > > > > > > > > > > > > > during late_initcall, and deferred_probe_initcall() is performed
> > > > > > > > > > > > > > at the same level.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > To resolve this, call ima_init() again at late_inicall_sync level
> > > > > > > > > > > > > > so that let IMA not miss TPM PCR value when generating boot_aggregate
> > > > > > > > > > > > > > log though TPM device presents in the system.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > > > > > > > > > > > >
> > > > > > > > > > > > > A lot of change for just detecting whether ima_init() is being called on
> > > > > > > > > > > > > late_initcall or late_initcall_sync(), without any explanation for all the other
> > > > > > > > > > > > > changes (e.g. ima_init_core).
> > > > > > > > > > > > >
> > > > > > > > > > > > > Please just limit the change to just calling ima_init() twice.
> > > > > > > > > > > >
> > > > > > > > > > > > My concern is that ima_update_policy_flags() will be called
> > > > > > > > > > > > when ima_init() is deferred -- not initialised anything.
> > > > > > > > > > > > though functionally, it might be okay however,
> > > > > > > > > > > > I think ima_update_policy_flags() and notifier should work after ima_init()
> > > > > > > > > > > > works logically.
> > > > > > > > > > > >
> > > > > > > > > > > > This change I think not much quite a lot. just wrapper ima_init() with
> > > > > > > > > > > > ima_init_core() with some error handling.
> > > > > > > > > > > >
> > > > > > > > > > > > Am I missing something?
> > > > > > > > > > >
> > > > > > > > > > > Also, if we handle in ima_init() only, but it failed with other reason,
> > > > > > > > > > > we shouldn't call again ima_init() in the late_initcall_sync.
> > > > > > > > > > >
> > > > > > > > > > > To handle this, It wouldn't do in the ima_init() but we need to handle
> > > > > > > > > > > it by caller of ima_init().
> > > > > > > > > >
> > > > > > > > > > Only tpm_default_chip() is being called to set the ima_tpm_chip. On failure,
> > > > > > > > > > instead of going into TPM-bypass mode, return immediately. There are no calls
> > > > > > > > > > to anything else. Just call ima_init() a second time.
> > > > > > > > >
> > > > > > > > > I’m not fully convinced this is sufficient.
> > > > > > > > >
> > > > > > > > > What I meant is the case where ima_init() fails due to other
> > > > > > > > > initialisation steps, not only tpm_default_chip() (e.g. ima_fs_init()).
> > > > > > > >
> > > > > > > > The purpose of THIS patch is to add late_initcall_sync, when the TPM is not
> > > > > > > > available at late_initcall. This would be classified as a bug fix and would be
> > > > > > > > backported. No other changes should be included in this patch.
> > > > > > >
> > > > > > > Okay.
> > > > > > >
> > > > > > > > >
> > > > > > > > > I’d also like to ask again whether it is fine to call
> > > > > > > > > ima_update_policy_flags() and keep the notifier registered in the
> > > > > > > > > deferred TPM case. While this may be functionally acceptable, it seems
> > > > > > > > > logically questionable to do so when ima_init() has not completed.
> > > > > > > >
> > > > > > > > Other than extending the TPM, IMA should behave exactly the same whether there
> > > > > > > > is a TPM or goes into TPM-bypass mode.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > There is also a possibility that a deferred case ultimately fails (e.g.
> > > > > > > > > deferred at late_initcall, but then failing at late_initcall_sync
> > > > > > > > > for another reason, even while entering TPM bypass mode). In that case,
> > > > > > > > > it seems more appropriate to handle this state in the caller of
> > > > > > > > > ima_init(), rather than inside ima_init() itself.
> > > > > > > >
> > > > > > > > If the TPM isn't found at late_initcall_sync(), then IMA should go into TPM-
> > > > > > > > bypass mode. Please don't make any other changes to the existing IMA behavior
> > > > > > > > and hide it here behind the late_initcall_sync change.
> > > > > > >
> > > > > > > Okay. you're talking called ima_update_policy_flags() at late_initcall
> > > > > > > wouldn't be not a problem even in case of late_initcall_sync's ima_init()
> > > > > > > get failed with "TPM-bypass mode".
> > > > > > >
> > > > > > > I see then, I'll make a patch simpler then.
> > > > > >
> > > > > > But I think in case of below situation:
> > > > > > - late_initcall's first ima_init() is deferred.
> > > > > > - late_initcall_sync try again but failed and try again with
> > > > > > CONFIG_IMA_DEFAULT_HASH.
> > > > > >
> > > > > > I would like to sustain init_ima_core to reduce the same code repeat
> > > > > > in late_initcall_sync.
> > > > >
> > > > > I think what Mimi's proposing is:
> > > > >
> > > > > If we're in late_initcall, and the TPM isn't available, return
> > > > > immediately with an error (the EPROBE_DEFER?), don't do any init.
> > > > >
> > > > > If we're in late_initcall_sync, either we're already initialised, so do
> > > > > return and nothing, or run through the entire flow, even if the TPM
> > > > > isn't unavailable.
> > > > >
> > > > > So ima_init() just needs to know a) if it's in the sync or non-sync mode
> > > > > and b) for the sync mode, if we've already done the init at
> > > > > non-sync.
> > > >
> > > > Thanks, Jonathan. That is exactly what I'm suggesting. Any other changes
> > > > should not be included in this patch. Since Yeoreum is not hearing me, feel
> > > > free to post a patch.
> > >
> > > I see. so what you need to is this only
> > > If it looks good to you. I'll send it at v3.
> >
> > FWIW, I pulled the tpm_default_chip check out a level to account for the
> > extra init you mentioned, and have the following (completely untested or
> > compiled, but gives the approach):
> >
> > diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> > index d48bf0ad26f4..88fe105b7f00 100644
> > --- a/include/linux/lsm_hooks.h
> > +++ b/include/linux/lsm_hooks.h
> > @@ -166,6 +166,7 @@ enum lsm_order {
> > * @initcall_fs: LSM callback for fs_initcall setup, optional
> > * @initcall_device: LSM callback for device_initcall() setup, optional
> > * @initcall_late: LSM callback for late_initcall() setup, optional
> > + * @initcall_late_sync: LSM callback for late_initcall_sync() setup, optional
> > */
> > struct lsm_info {
> > const struct lsm_id *id;
> > @@ -181,6 +182,7 @@ struct lsm_info {
> > int (*initcall_fs)(void);
> > int (*initcall_device)(void);
> > int (*initcall_late)(void);
> > + int (*initcall_late_sync)(void);
> > };
> > #define DEFINE_LSM(lsm) \
> > diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
> > index a2f34f2d8ad7..a60dfb8316d8 100644
> > --- a/security/integrity/ima/ima_init.c
> > +++ b/security/integrity/ima/ima_init.c
> > @@ -119,10 +119,6 @@ int __init ima_init(void)
> > {
> > int rc;
> > - ima_tpm_chip = tpm_default_chip();
> > - if (!ima_tpm_chip)
> > - pr_info("No TPM chip found, activating TPM-bypass!\n");
> > -
> > rc = integrity_init_keyring(INTEGRITY_KEYRING_IMA);
> > if (rc)
> > return rc;
> > diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
> > index 1d6229b156fb..b60a85fa803a 100644
> > --- a/security/integrity/ima/ima_main.c
> > +++ b/security/integrity/ima/ima_main.c
> > @@ -1237,7 +1237,7 @@ static int ima_kernel_module_request(char *kmod_name)
> > #endif /* CONFIG_INTEGRITY_ASYMMETRIC_KEYS */
> > -static int __init init_ima(void)
> > +static int __init init_ima(bool sync)
> > {
> > int error;
> > @@ -1247,6 +1247,19 @@ static int __init init_ima(void)
> > return 0;
> > }
> > + /* If we found the TPM during our first attempt, nothing further to do */
> > + if (sync && ima_tpm_chip)
> > + return 0;
> > +
> > + ima_tpm_chip = tpm_default_chip();
> > + if (!ima_tpm_chip && !sync) {
> > + pr_debug("TPM not available, will try later\n");
> > + return -EPROBE_DEFER;
> > + }
> > +
> > + if (!ima_tpm_chip)
> > + pr_info("No TPM chip found, activating TPM-bypass!\n");
> > +
> > ima_appraise_parse_cmdline();
> > ima_init_template_list();
> > hash_setup(CONFIG_IMA_DEFAULT_HASH);
> > @@ -1274,6 +1287,16 @@ static int __init init_ima(void)
> > return error;
> > }
> > +static int __init init_ima_late(void)
> > +{
> > + return init_ima(false);
> > +}
> > +
> > +static int __init init_ima_late_sync(void)
> > +{
> > + return init_ima(true);
> > +}
> > +
> > static struct security_hook_list ima_hooks[] __ro_after_init = {
> > LSM_HOOK_INIT(bprm_check_security, ima_bprm_check),
> > LSM_HOOK_INIT(bprm_creds_for_exec, ima_bprm_creds_for_exec),
> > @@ -1319,6 +1342,7 @@ DEFINE_LSM(ima) = {
> > .init = init_ima_lsm,
> > .order = LSM_ORDER_LAST,
> > .blobs = &ima_blob_sizes,
> > - /* Start IMA after the TPM is available */
> > - .initcall_late = init_ima,
> > + /* Ensure we start IMA after the TPM is available */
> > + .initcall_late = init_ima_late,
> > + .initcall_late_sync = init_ima_late_sync,
> > };
> > diff --git a/security/lsm_init.c b/security/lsm_init.c
> > index 573e2a7250c4..4e5c59beb82a 100644
> > --- a/security/lsm_init.c
> > +++ b/security/lsm_init.c
> > @@ -547,13 +547,22 @@ device_initcall(security_initcall_device);
> > * security_initcall_late - Run the LSM late initcalls
> > */
> > static int __init security_initcall_late(void)
> > +{
> > + return lsm_initcall(late);
> > +}
> > +late_initcall(security_initcall_late);
> > +
> > +/**
> > + * security_initcall_late_sync - Run the LSM late initcalls sync
> > + */
> > +static int __init security_initcall_late_sync(void)
> > {
> > int rc;
> > - rc = lsm_initcall(late);
> > + rc = lsm_initcall(late_sync);
> > lsm_pr_dbg("all enabled LSMs fully activated\n");
> > call_blocking_lsm_notifier(LSM_STARTED_ALL, NULL);
> > return rc;
> > }
> > -late_initcall(security_initcall_late);
> > +late_initcall_sync(security_initcall_late_sync);
>
> I'm fine this. but are we talking about "ima_init()" not "init_ima()"?

Having two functions named ima_init() and init_ima() is really confusing. At
least with this patch, init_ima() will be replaced with init_ima_late() and
init_ima_late_sync().

> Because of this, I've fixuated and make a long stupid speaking myself.

The commit 0e0546eabcd6 ("firmware: arm_ffa: Change initcall level of ffa_init()
to rootfs_initcall") patch description was really well written. I'm really sad
that it needs to be reverted.

The TPM not being initialized before IMA has been an issue for a really long
time. Hopefully this patch will safely fix it, not only for you, but for others
as well.
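
The two-stage flow in the quoted diff can be modelled in user space roughly as
below. This is only a sketch under stated assumptions: tpm_default_chip() is
replaced by a stub, probed_chip and full_init_runs are made-up scaffolding,
and the real init_ima() does far more than bump a counter.

```c
#include <stdbool.h>
#include <stddef.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, not in userspace headers */
#endif

struct tpm_chip { int id; };		/* stand-in for the kernel structure */

static struct tpm_chip *ima_tpm_chip;	/* chip IMA has bound to, if any */
static struct tpm_chip *probed_chip;	/* assumption: what the stub lookup finds */
static int full_init_runs;		/* assumption: counts full init passes */

/* Stub for the real lookup: returns whatever the scaffolding has "probed". */
static struct tpm_chip *tpm_default_chip(void)
{
	return probed_chip;
}

static int init_ima(bool sync)
{
	/* The first attempt already found the TPM, so init is complete. */
	if (sync && ima_tpm_chip)
		return 0;

	ima_tpm_chip = tpm_default_chip();
	if (!ima_tpm_chip && !sync)
		return -EPROBE_DEFER;	/* do nothing yet, retry at the sync level */

	/* TPM found, or last chance: run the full init (TPM-bypass if no chip). */
	full_init_runs++;
	return 0;
}
```

Calling init_ima(false) and then init_ima(true) with probed_chip left NULL
defers once and then runs the full init a single time in TPM-bypass mode;
with a chip present from the start, the second call returns early.
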
>
> If this seems good to Mimi, I don't care who send it.
> But If you're going to send this, could you includes 2 and 3 too?

Once this patch is ready, we can create a topic branch to coordinate upstreaming
the remaining patches.

thanks!

Mimi
^ permalink raw reply
* Re: [RFC PATCH v2 1/4] security: ima: call ima_init() again at late_initcall_sync for defered TPM
From: Yeoreum Yun @ 2026-04-23 18:13 UTC (permalink / raw)
To: Mimi Zohar
Cc: Jonathan McDowell, linux-security-module, linux-kernel,
linux-integrity, linux-arm-kernel, kvmarm, paul, jmorris, serge,
roberto.sassu, dmitry.kasatkin, eric.snowberg, jarkko, jgg,
sudeep.holla, maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, noodles, sebastianene
In-Reply-To: <e4e242ae5533d5762a3647186a178764881bf9ff.camel@linux.ibm.com>
> On Thu, 2026-04-23 at 15:33 +0100, Yeoreum Yun wrote:
> > Hi Jonathan,
> >
> > > On Thu, Apr 23, 2026 at 02:55:14PM +0100, Yeoreum Yun wrote:
> > > > > On Thu, 2026-04-23 at 13:53 +0100, Jonathan McDowell wrote:
> > > > > > On Thu, Apr 23, 2026 at 01:34:13PM +0100, Yeoreum Yun wrote:
> > > > > > > > > On Thu, 2026-04-23 at 06:55 +0100, Yeoreum Yun wrote:
> > > > > > > > > > > On Wed, 2026-04-22 at 20:41 +0100, Yeoreum Yun wrote:
> > > > > > > > > > > > > Hi Mimi,
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, 2026-04-22 at 17:24 +0100, Yeoreum Yun wrote:
> > > > > > > > > > > > > > > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
> > > > > > > > > > > > > > > the TPM driver must be built as built-in and
> > > > > > > > > > > > > > > must be probed before the IMA subsystem is initialized.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > However, when the TPM device operates over the FF-A protocol using
> > > > > > > > > > > > > > > the CRB interface, probing fails and returns -EPROBE_DEFER if
> > > > > > > > > > > > > > > the tpm_crb_ffa device — an FF-A device that provides the communication
> > > > > > > > > > > > > > > interface to the tpm_crb driver — has not yet been probed.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > To ensure the TPM device operating over the FF-A protocol with
> > > > > > > > > > > > > > > the CRB interface is probed before IMA initialization,
> > > > > > > > > > > > > > > the following conditions must be met:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1. The corresponding ffa_device must be registered,
> > > > > > > > > > > > > > > which is done via ffa_init().
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2. The tpm_crb_driver must successfully probe this device via
> > > > > > > > > > > > > > > tpm_crb_ffa_init().
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 3. The tpm_crb driver using CRB over FF-A can then
> > > > > > > > > > > > > > > be probed successfully. (See crb_acpi_add() and
> > > > > > > > > > > > > > > tpm_crb_ffa_init() for reference.)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
> > > > > > > > > > > > > > > all registered with device_initcall, which means crb_acpi_driver_init() may
> > > > > > > > > > > > > > > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > When this occurs, probing the TPM device is deferred.
> > > > > > > > > > > > > > > However, the deferred probe can happen after the IMA subsystem
> > > > > > > > > > > > > > > has already been initialized, since IMA initialization is performed
> > > > > > > > > > > > > > > during late_initcall, and deferred_probe_initcall() is performed
> > > > > > > > > > > > > > > at the same level.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > To resolve this, call ima_init() again at late_inicall_sync level
> > > > > > > > > > > > > > > so that let IMA not miss TPM PCR value when generating boot_aggregate
> > > > > > > > > > > > > > > log though TPM device presents in the system.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > A lot of change for just detecting whether ima_init() is being called on
> > > > > > > > > > > > > > late_initcall or late_initcall_sync(), without any explanation for all the other
> > > > > > > > > > > > > > changes (e.g. ima_init_core).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Please just limit the change to just calling ima_init() twice.
> > > > > > > > > > > > >
> > > > > > > > > > > > > My concern is that ima_update_policy_flags() will be called
> > > > > > > > > > > > > when ima_init() is deferred -- not initialised anything.
> > > > > > > > > > > > > though functionally, it might be okay however,
> > > > > > > > > > > > > I think ima_update_policy_flags() and notifier should work after ima_init()
> > > > > > > > > > > > > works logically.
> > > > > > > > > > > > >
> > > > > > > > > > > > > This change I think not much quite a lot. just wrapper ima_init() with
> > > > > > > > > > > > > ima_init_core() with some error handling.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Am I missing something?
> > > > > > > > > > > >
> > > > > > > > > > > > Also, if we handle in ima_init() only, but it failed with other reason,
> > > > > > > > > > > > we shouldn't call again ima_init() in the late_initcall_sync.
> > > > > > > > > > > >
> > > > > > > > > > > > To handle this, It wouldn't do in the ima_init() but we need to handle
> > > > > > > > > > > > it by caller of ima_init().
> > > > > > > > > > >
> > > > > > > > > > > Only tpm_default_chip() is being called to set the ima_tpm_chip. On failure,
> > > > > > > > > > > instead of going into TPM-bypass mode, return immediately. There are no calls
> > > > > > > > > > > to anything else. Just call ima_init() a second time.
> > > > > > > > > >
> > > > > > > > > > I’m not fully convinced this is sufficient.
> > > > > > > > > >
> > > > > > > > > > What I meant is the case where ima_init() fails due to other
> > > > > > > > > > initialisation steps, not only tpm_default_chip() (e.g. ima_fs_init()).
> > > > > > > > >
> > > > > > > > > The purpose of THIS patch is to add late_initcall_sync, when the TPM is not
> > > > > > > > > available at late_initcall. This would be classified as a bug fix and would be
> > > > > > > > > backported. No other changes should be included in this patch.
> > > > > > > >
> > > > > > > > Okay.
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I’d also like to ask again whether it is fine to call
> > > > > > > > > > ima_update_policy_flags() and keep the notifier registered in the
> > > > > > > > > > deferred TPM case. While this may be functionally acceptable, it seems
> > > > > > > > > > logically questionable to do so when ima_init() has not completed.
> > > > > > > > >
> > > > > > > > > Other than extending the TPM, IMA should behave exactly the same whether there
> > > > > > > > > is a TPM or goes into TPM-bypass mode.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > There is also a possibility that a deferred case ultimately fails (e.g.
> > > > > > > > > > deferred at late_initcall, but then failing at late_initcall_sync
> > > > > > > > > > for another reason, even while entering TPM bypass mode). In that case,
> > > > > > > > > > it seems more appropriate to handle this state in the caller of
> > > > > > > > > > ima_init(), rather than inside ima_init() itself.
> > > > > > > > >
> > > > > > > > > If the TPM isn't found at late_initcall_sync(), then IMA should go into TPM-
> > > > > > > > > bypass mode. Please don't make any other changes to the existing IMA behavior
> > > > > > > > > and hide it here behind the late_initcall_sync change.
> > > > > > > >
> > > > > > > > Okay. you're talking called ima_update_policy_flags() at late_initcall
> > > > > > > > wouldn't be not a problem even in case of late_initcall_sync's ima_init()
> > > > > > > > get failed with "TPM-bypass mode".
> > > > > > > >
> > > > > > > > I see then, I'll make a patch simpler then.
> > > > > > >
> > > > > > > But I think in case of below situation:
> > > > > > > - late_initcall's first ima_init() is deferred.
> > > > > > > - late_initcall_sync try again but failed and try again with
> > > > > > > CONFIG_IMA_DEFAULT_HASH.
> > > > > > >
> > > > > > > I would like to sustain init_ima_core to reduce the same code repeat
> > > > > > > in late_initcall_sync.
> > > > > >
> > > > > > I think what Mimi's proposing is:
> > > > > >
> > > > > > If we're in late_initcall, and the TPM isn't available, return
> > > > > > immediately with an error (the EPROBE_DEFER?), don't do any init.
> > > > > >
> > > > > > If we're in late_initcall_sync, either we're already initialised, so do
> > > > > > return and nothing, or run through the entire flow, even if the TPM
> > > > > > isn't unavailable.
> > > > > >
> > > > > > So ima_init() just needs to know a) if it's in the sync or non-sync mode
> > > > > > and b) for the sync mode, if we've already done the init at
> > > > > > non-sync.
> > > > >
> > > > > Thanks, Jonathan. That is exactly what I'm suggesting. Any other changes
> > > > > should not be included in this patch. Since Yeoreum is not hearing me, feel
> > > > > free to post a patch.
> > > >
> > > > I see. so what you need to is this only
> > > > If it looks good to you. I'll send it at v3.
> > >
> > > FWIW, I pulled the tpm_default_chip check out a level to account for the
> > > extra init you mentioned, and have the following (completely untested or
> > > compiled, but gives the approach):
> > >
> > > diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> > > index d48bf0ad26f4..88fe105b7f00 100644
> > > --- a/include/linux/lsm_hooks.h
> > > +++ b/include/linux/lsm_hooks.h
> > > @@ -166,6 +166,7 @@ enum lsm_order {
> > > * @initcall_fs: LSM callback for fs_initcall setup, optional
> > > * @initcall_device: LSM callback for device_initcall() setup, optional
> > > * @initcall_late: LSM callback for late_initcall() setup, optional
> > > + * @initcall_late_sync: LSM callback for late_initcall_sync() setup, optional
> > > */
> > > struct lsm_info {
> > > const struct lsm_id *id;
> > > @@ -181,6 +182,7 @@ struct lsm_info {
> > > int (*initcall_fs)(void);
> > > int (*initcall_device)(void);
> > > int (*initcall_late)(void);
> > > + int (*initcall_late_sync)(void);
> > > };
> > > #define DEFINE_LSM(lsm) \
> > > diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
> > > index a2f34f2d8ad7..a60dfb8316d8 100644
> > > --- a/security/integrity/ima/ima_init.c
> > > +++ b/security/integrity/ima/ima_init.c
> > > @@ -119,10 +119,6 @@ int __init ima_init(void)
> > > {
> > > int rc;
> > > - ima_tpm_chip = tpm_default_chip();
> > > - if (!ima_tpm_chip)
> > > - pr_info("No TPM chip found, activating TPM-bypass!\n");
> > > -
> > > rc = integrity_init_keyring(INTEGRITY_KEYRING_IMA);
> > > if (rc)
> > > return rc;
> > > diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
> > > index 1d6229b156fb..b60a85fa803a 100644
> > > --- a/security/integrity/ima/ima_main.c
> > > +++ b/security/integrity/ima/ima_main.c
> > > @@ -1237,7 +1237,7 @@ static int ima_kernel_module_request(char *kmod_name)
> > > #endif /* CONFIG_INTEGRITY_ASYMMETRIC_KEYS */
> > > -static int __init init_ima(void)
> > > +static int __init init_ima(bool sync)
> > > {
> > > int error;
> > > @@ -1247,6 +1247,19 @@ static int __init init_ima(void)
> > > return 0;
> > > }
> > > + /* If we found the TPM during our first attempt, nothing further to do */
> > > + if (sync && ima_tpm_chip)
> > > + return 0;
> > > +
> > > + ima_tpm_chip = tpm_default_chip();
> > > + if (!ima_tpm_chip && !sync) {
> > > + pr_debug("TPM not available, will try later\n");
> > > + return -EPROBE_DEFER;
> > > + }
> > > +
> > > + if (!ima_tpm_chip)
> > > + pr_info("No TPM chip found, activating TPM-bypass!\n");
> > > +
> > > ima_appraise_parse_cmdline();
> > > ima_init_template_list();
> > > hash_setup(CONFIG_IMA_DEFAULT_HASH);
> > > @@ -1274,6 +1287,16 @@ static int __init init_ima(void)
> > > return error;
> > > }
> > > +static int __init init_ima_late(void)
> > > +{
> > > + return init_ima(false);
> > > +}
> > > +
> > > +static int __init init_ima_late_sync(void)
> > > +{
> > > + return init_ima(true);
> > > +}
> > > +
> > > static struct security_hook_list ima_hooks[] __ro_after_init = {
> > > LSM_HOOK_INIT(bprm_check_security, ima_bprm_check),
> > > LSM_HOOK_INIT(bprm_creds_for_exec, ima_bprm_creds_for_exec),
> > > @@ -1319,6 +1342,7 @@ DEFINE_LSM(ima) = {
> > > .init = init_ima_lsm,
> > > .order = LSM_ORDER_LAST,
> > > .blobs = &ima_blob_sizes,
> > > - /* Start IMA after the TPM is available */
> > > - .initcall_late = init_ima,
> > > + /* Ensure we start IMA after the TPM is available */
> > > + .initcall_late = init_ima_late,
> > > + .initcall_late_sync = init_ima_late_sync,
> > > };
> > > diff --git a/security/lsm_init.c b/security/lsm_init.c
> > > index 573e2a7250c4..4e5c59beb82a 100644
> > > --- a/security/lsm_init.c
> > > +++ b/security/lsm_init.c
> > > @@ -547,13 +547,22 @@ device_initcall(security_initcall_device);
> > > * security_initcall_late - Run the LSM late initcalls
> > > */
> > > static int __init security_initcall_late(void)
> > > +{
> > > + return lsm_initcall(late);
> > > +}
> > > +late_initcall(security_initcall_late);
> > > +
> > > +/**
> > > + * security_initcall_late_sync - Run the LSM late initcalls sync
> > > + */
> > > +static int __init security_initcall_late_sync(void)
> > > {
> > > int rc;
> > > - rc = lsm_initcall(late);
> > > + rc = lsm_initcall(late_sync);
> > > lsm_pr_dbg("all enabled LSMs fully activated\n");
> > > call_blocking_lsm_notifier(LSM_STARTED_ALL, NULL);
> > > return rc;
> > > }
> > > -late_initcall(security_initcall_late);
> > > +late_initcall_sync(security_initcall_late_sync);
> >
> > I'm fine this. but are we talking about "ima_init()" not "init_ima()"?
>
> Having two functions named ima_init() and init_ima() is really confusing. At
> least with this patch, init_ima() will be replaced with init_ima_late() and
> init_ima_late_sync().
>
> > Because of this, I've fixuated and make a long stupid speaking myself.
>
> The commit 0e0546eabcd6 ("firmware: arm_ffa: Change initcall level of ffa_init()
> to rootfs_initcall") patch description was really well written. I'm really sad
> that it needs to be reverted.
>
> The TPM not being initialized before IMA, has been an issue for a really long
> time. Hopefully this patch will safely fix it, not only for you, but for others
> as well.
>
> >
> > If this seems good to Mimi, I don't care who send it.
> > But If you're going to send this, could you includes 2 and 3 too?
>
> Once this patch is ready, we can create a topic branch to coordinate upstreaming
> the remaining patches.

Sounds good. Once the patch is posted, I’ll review it as well.

Sorry again for the noise, and thanks for your patience ;)
--
Sincerely,
Yeoreum Yun
^ permalink raw reply
* Re: [PATCH v5 6/10] security: Hornet LSM
From: Paul Moore @ 2026-04-23 18:37 UTC (permalink / raw)
To: Blaise Boscaccy, Blaise Boscaccy, Jonathan Corbet, James Morris,
Serge E. Hallyn, Mickaël Salaün, Günther Noack,
Dr. David Alan Gilbert, Andrew Morton, James.Bottomley, dhowells,
Fan Wu, Ryan Foster, Randy Dunlap, linux-security-module,
linux-doc, linux-kernel, bpf, Song Liu
In-Reply-To: <20260420212653.438685-7-bboscaccy@linux.microsoft.com>
On Apr 20, 2026 Blaise Boscaccy <bboscaccy@linux.microsoft.com> wrote:
>
> This adds the Hornet Linux Security Module which provides enhanced
> signature verification and data validation for eBPF programs. This
> allows users to continue to maintain an invariant that all code
> running inside of the kernel has actually been signed and verified, by
> the kernel.
>
> This effort builds upon the currently accepted upstream solution. It
> further hardens it by providing deterministic, in-kernel checking of
> map hashes to solidify auditing along with preventing TOCTOU attacks
> against lskel map hashes.
>
> Target map hashes are passed in via PKCS#7 signed attributes. Hornet
> determines the extent to which the eBPF program is signed and defers to
> other LSMs for policy decisions.
>
> Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
> Nacked-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> ---
> Documentation/admin-guide/LSM/Hornet.rst | 321 +++++++++++++++++++++
> Documentation/admin-guide/LSM/index.rst | 1 +
> MAINTAINERS | 9 +
> include/linux/oid_registry.h | 3 +
> include/uapi/linux/lsm.h | 1 +
> security/Kconfig | 3 +-
> security/Makefile | 1 +
> security/hornet/Kconfig | 11 +
> security/hornet/Makefile | 7 +
> security/hornet/hornet.asn1 | 13 +
> security/hornet/hornet_lsm.c | 346 +++++++++++++++++++++++
> 11 files changed, 715 insertions(+), 1 deletion(-)
> create mode 100644 Documentation/admin-guide/LSM/Hornet.rst
> create mode 100644 security/hornet/Kconfig
> create mode 100644 security/hornet/Makefile
> create mode 100644 security/hornet/hornet.asn1
> create mode 100644 security/hornet/hornet_lsm.c
While I think this is looking pretty reasonable, I think Fan had some
feedback which merits a reply. I also spotted some references to the
secondary keyring in the docs which need to be updated (below).
> diff --git a/Documentation/admin-guide/LSM/Hornet.rst b/Documentation/admin-guide/LSM/Hornet.rst
> new file mode 100644
> index 0000000000000..af5e9cd9d83a8
> --- /dev/null
> +++ b/Documentation/admin-guide/LSM/Hornet.rst
> @@ -0,0 +1,321 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +======
> +Hornet
> +======
> +
> +Hornet is a Linux Security Module that provides extensible signature
> +verification for eBPF programs. This is selectable at build-time with
> +``CONFIG_SECURITY_HORNET``.
> +
> +Overview
> +========
> +
> +Hornet addresses concerns from users who require strict audit trails and
> +verification guarantees for eBPF programs, especially in
> +security-sensitive environments. Many production systems need assurance
> +that only authorized, unmodified eBPF programs are loaded into the
> +kernel. Hornet provides this assurance through cryptographic signature
> +verification.
> +
> +When an eBPF program is loaded via the ``bpf()`` syscall, Hornet
> +verifies a PKCS#7 signature attached to the program instructions. The
> +signature is checked against the kernel's secondary keyring using the
This version now supports using the keyring specified in the bpf_attr
union (presumably for maximum compatibility with KP's signature scheme),
which is good, but the docs need to be updated.
See below, but I would probably make a note that LSMs providing
enforcement of BPF signatures will likely want to check what keyring was
used to verify the signature, e.g. a trusted keyring vs a user supplied
keyring.
> +existing kernel cryptographic infrastructure. In addition to signing the
> +program bytecode, Hornet supports signing SHA-256 hashes of associated
> +BPF maps, enabling integrity verification of map contents at load time
> +and at runtime.
> +
> +After verification, Hornet classifies the program into one of the
> +following integrity states and passes the result to a downstream LSM hook
> +(``bpf_prog_load_post_integrity``), allowing other security modules to
> +make policy decisions based on the verification outcome:
> +
> +``LSM_INT_VERDICT_OK``
> + The program signature and all map hashes verified successfully.
> +
> +``LSM_INT_VERDICT_UNSIGNED``
> + No signature was provided with the program.
> +
> +``LSM_INT_VERDICT_PARTIALSIG``
> + The program signature verified, but the signature did not contain
> + hornet map hash data.
> +
> +``LSM_INT_VERDICT_UNKNOWNKEY``
> + The signing certificate is not trusted in the secondary keyring,
Another secondary keyring mention.
> +``LSM_INT_VERDICT_FAULT``
> + A system error occurred during verification.
> +
> +``LSM_INT_VERDICT_UNEXPECTED``
> + An unexpected map hash value was encountered.
> +
> +``LSM_INT_VERDICT_BADSIG``
> + The signature or a map hash failed verification.
> +
> +Hornet itself does not enforce a policy on whether unsigned or partially
> +signed programs should be rejected. It delegates that decision to
> +downstream LSMs via the ``bpf_prog_load_post_integrity`` hook, making it
> +a composable building block in a larger security architecture.
This might be a good place to document that in addition to the verdicts
described above, LSMs providing enforcement should also consider the
keyring used for verification.
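To make the point above concrete, here is a minimal userspace sketch of how an enforcing LSM might combine the verdict with the keyring that was used. The verdict names come from the quoted documentation; the numeric values, the `keyring_kind` enum, and `example_policy_allows()` are hypothetical and are not part of the posted patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Verdict names are from the Hornet docs; the values are placeholders. */
enum lsm_int_verdict {
	LSM_INT_VERDICT_OK,
	LSM_INT_VERDICT_UNSIGNED,
	LSM_INT_VERDICT_PARTIALSIG,
	LSM_INT_VERDICT_UNKNOWNKEY,
	LSM_INT_VERDICT_FAULT,
	LSM_INT_VERDICT_UNEXPECTED,
	LSM_INT_VERDICT_BADSIG,
};

/* Hypothetical: how the enforcing LSM classifies the verifying keyring. */
enum keyring_kind {
	KEYRING_TRUSTED,	/* e.g. a kernel-managed trusted keyring */
	KEYRING_USER_SUPPLIED,	/* e.g. a keyring named by userspace */
};

/*
 * Example strict policy: allow only fully verified programs whose
 * signature was checked against a trusted keyring.
 */
static bool example_policy_allows(enum lsm_int_verdict v, enum keyring_kind k)
{
	/* Sanity check: reject values outside the known verdict range. */
	assert(v >= LSM_INT_VERDICT_OK && v <= LSM_INT_VERDICT_BADSIG);

	if (v != LSM_INT_VERDICT_OK)
		return false;

	/* A good signature from a user-supplied keyring proves little. */
	return k == KEYRING_TRUSTED;
}
```

A more permissive policy could, for instance, also accept `LSM_INT_VERDICT_PARTIALSIG`; the point is only that the keyring is a second input alongside the verdict.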
> +Known Limitations
> +=================
> +
> +- Hornet requires programs to use :doc:`light skeletons
> + </bpf/libbpf/libbpf_naming_convention>` (lskels) for the signing
> + workflow, as the tooling operates on lskel-generated headers.
> +
> +- A maximum of 64 maps per program can be tracked for hash
> + verification.
> +
> +- Map hash verification requires the maps to be frozen before loading.
> + Maps that are not frozen at load time will cause verification to fail
> + when their hashes are included in the signature.
> +
> +- Hornet relies on the kernel's secondary keyring
> + (``VERIFY_USE_SECONDARY_KEYRING``) for certificate trust. Keys must
> + be provisioned into this keyring before programs can be verified.
... another spot.
> +- The only hashing algorithm available is SHA256 due to it being hardcoded
> + in the bpf subsystem.
...
> +Signature Verification Flow
> +---------------------------
> +
> +The following describes what happens when a userspace program calls
> +``bpf(BPF_PROG_LOAD, ...)`` with a signature attached:
> +
> +1. The ``bpf_prog_load_integrity`` LSM hook is invoked.
> +
> +2. Hornet reads the signature from the userspace buffer specified by
> + ``attr->signature`` (with length ``attr->signature_size``).
> +
> +3. The PKCS#7 signature is verified against the program instructions
> + using ``verify_pkcs7_signature()`` with the kernel's secondary
> + keyring.
I believe this is the last mention of the secondary keyring.
> +4. The PKCS#7 message is parsed and its trust chain is validated via
> + ``validate_pkcs7_trust()``.
> +
> +5. Hornet extracts the authenticated attribute identified by
> + ``OID_hornet_data`` (OID ``2.25.316487325684022475439036912669789383960``)
> + from the PKCS#7 message. This attribute contains an ASN.1-encoded set
> + of map index/hash pairs.
> +
> +6. For each map hash entry, Hornet retrieves the corresponding BPF map
> + via its file descriptor, confirms it is frozen, computes its SHA-256
> + hash, and compares it against the signed hash.
> +
> +7. The resulting integrity verdict is passed to the
> + ``bpf_prog_load_post_integrity`` hook so that downstream LSMs can
> + enforce policy.
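Step 6 of the flow above can be sketched in userspace C roughly as follows. This is an illustration only: the struct names, the result enum, and `check_map_hashes()` are hypothetical, the digests are assumed to be precomputed rather than hashed here, and the 64-entry cap is taken from the "Known Limitations" text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SHA256_LEN		32
#define MAX_TRACKED_MAPS	64	/* limit quoted in the docs */

/* Hypothetical: one signed (index, hash) pair from the PKCS#7 attribute. */
struct signed_map_hash {
	unsigned int index;
	unsigned char hash[SHA256_LEN];
};

/* Hypothetical: what we know about a map at load time. */
struct map_state {
	bool frozen;
	unsigned char hash[SHA256_LEN];	/* assumed precomputed SHA-256 */
};

enum check_result {
	CHECK_OK,
	CHECK_NOT_FROZEN,
	CHECK_MISMATCH,
	CHECK_BAD_INDEX,
};

/* Walk the signed entries; stop at the first map that fails a check. */
static enum check_result
check_map_hashes(const struct signed_map_hash *entries, size_t n_entries,
		 const struct map_state *maps, size_t n_maps)
{
	size_t i;

	assert(entries != NULL || n_entries == 0);

	if (n_entries > MAX_TRACKED_MAPS)
		return CHECK_BAD_INDEX;

	for (i = 0; i < n_entries; i++) {
		const struct map_state *m;

		if (entries[i].index >= n_maps)
			return CHECK_BAD_INDEX;
		m = &maps[entries[i].index];

		/* An unfrozen map could change after it was hashed. */
		if (!m->frozen)
			return CHECK_NOT_FROZEN;
		if (memcmp(m->hash, entries[i].hash, SHA256_LEN) != 0)
			return CHECK_MISMATCH;
	}
	return CHECK_OK;
}
```

How each failure maps onto the `LSM_INT_VERDICT_*` values is left out on purpose, since the quoted documentation does not spell that mapping out.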
--
paul-moore.com
* [PATCH v3] evm: terminate and bound the evm_xattrs read buffer
From: Pengpeng Hou @ 2026-04-23 15:30 UTC (permalink / raw)
To: Mimi Zohar, Roberto Sassu
Cc: Roberto Sassu, Dmitry Kasatkin, Eric Snowberg, Paul Moore,
James Morris, Serge Hallyn, linux-integrity,
linux-security-module, linux-kernel, pengpeng
In-Reply-To: <20260417223004.1-evm-xattrs-v2-pengpeng@iscas.ac.cn>
evm_read_xattrs() allocates size + 1 bytes, fills them from the list of
enabled xattrs, and then passes strlen(temp) to
simple_read_from_buffer(). When no configured xattrs are enabled, the
fill loop stores nothing and temp[0] remains uninitialized, so strlen()
reads beyond initialized memory.
Explicitly terminate the buffer after allocation, use snprintf() for
each formatted line, and pass the accumulated length, without risk of
truncation, to simple_read_from_buffer().
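For readers following along, the pattern described above can be sketched in plain userspace C. This is not the kernel code: `struct xattr_entry` and `format_enabled()` are hypothetical stand-ins, `malloc()` replaces the kernel allocator, and the locking is omitted.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's xattr list entries. */
struct xattr_entry {
	const char *name;
	int enabled;
};

/*
 * Emit "name\n" for each enabled entry. Returns a NUL-terminated buffer
 * (caller frees) and stores the number of bytes written in *len.
 */
static char *format_enabled(const struct xattr_entry *list, size_t n,
			    size_t *len)
{
	size_t size = 0, offset = 0, i;
	char *buf;

	/* First pass: name plus '\n' for every enabled entry. */
	for (i = 0; i < n; i++)
		if (list[i].enabled)
			size += strlen(list[i].name) + 1;

	buf = malloc(size + 1);
	if (!buf)
		return NULL;

	/* Terminate up front so an empty list still yields "". */
	buf[size] = '\0';

	/* Second pass: bounded writes, accumulating the real length. */
	for (i = 0; i < n; i++) {
		int w;

		if (!list[i].enabled)
			continue;
		w = snprintf(buf + offset, size + 1 - offset, "%s\n",
			     list[i].name);
		/* Both passes see the same entries, so nothing truncates. */
		assert(w >= 0 && (size_t)w <= size - offset);
		offset += (size_t)w;
	}

	*len = offset;
	return buf;
}
```

With no enabled entries the function returns an empty string and a length of 0, which is the case that previously left `temp[0]` uninitialized.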
Fixes: fa516b66a1bf ("EVM: Allow runtime modification of the set of verified xattrs")
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
---
Changes since v2:
- adjust the changelog wording to mention why the accumulated length is
safe
- add the blank line after the allocation error path
- add a comment explaining why snprintf() cannot truncate in the fill loop
diff --git a/security/integrity/evm/evm_secfs.c b/security/integrity/evm/evm_secfs.c
index acd840461902..4baf5e23bc97 100644
--- a/security/integrity/evm/evm_secfs.c
+++ b/security/integrity/evm/evm_secfs.c
@@ -127,8 +127,8 @@ static ssize_t evm_read_xattrs(struct file *filp, char __user *buf,
size_t count, loff_t *ppos)
{
char *temp;
- int offset = 0;
- ssize_t rc, size = 0;
+ size_t offset = 0, size = 0;
+ ssize_t rc;
struct xattr_list *xattr;
if (*ppos != 0)
@@ -151,16 +151,22 @@ static ssize_t evm_read_xattrs(struct file *filp, char __user *buf,
return -ENOMEM;
}
+ temp[size] = '\0';
+
+ /*
+ * No truncation possible: size is computed over the same enabled
+ * xattrs under xattr_list_mutex, so offset never exceeds size.
+ */
list_for_each_entry(xattr, &evm_config_xattrnames, list) {
if (!xattr->enabled)
continue;
- sprintf(temp + offset, "%s\n", xattr->name);
- offset += strlen(xattr->name) + 1;
+ offset += snprintf(temp + offset, size + 1 - offset, "%s\n",
+ xattr->name);
}
mutex_unlock(&xattr_list_mutex);
- rc = simple_read_from_buffer(buf, count, ppos, temp, strlen(temp));
+ rc = simple_read_from_buffer(buf, count, ppos, temp, offset);
kfree(temp);
--
2.50.1 (Apple Git-155)
* Re: [PATCH] tomoyo: reject short exec.envp[] names before suffix checks
From: Pengpeng Hou @ 2026-04-23 22:53 UTC (permalink / raw)
To: Tetsuo Handa
Cc: Kentaro Takeda, Paul Moore, James Morris, Serge Hallyn,
linux-security-module, linux-kernel, pengpeng
In-Reply-To: <20260417073249.93906-1-pengpeng@iscas.ac.cn>
Hi Tetsuo,
Thanks for the explanation.
Agreed, I missed that the left-hand string is already guaranteed to be
safely dereferenced at that call site. I'll drop this patch.
Thanks,
Pengpeng
* [GIT PULL] AppArmor updates for 7.1
From: John Johansen @ 2026-04-23 23:53 UTC (permalink / raw)
To: Linus Torvalds; +Cc: open list:SECURITY SUBSYSTEM, LKLM
Hi Linus,
We ran into some issues with what was going to go in this cycle's PR,
and I am still not happy with the revisions. So I am just going with
some bug fixes and 3 cleanups that have been queued most of the cycle.
This has been merged, built, and regression tested against your tree
from a couple of days ago.
There isn't anything all that exciting in here. Separating out the
changes:
+ Cleanups
- Use sysfs_emit in param_get_{audit,mode}
- Remove redundant if check in sk_peer_get_label
- Replace memcpy + NUL termination with kmemdup_nul in do_setattr
+ Bug Fixes
- Fix aa_dfa_unpack's error handling in aa_setup_dfa_engine
- Fix string overrun due to missing termination
- Fix wrong dentry in RENAME_EXCHANGE uid check
- fix unpack_tags to properly return error in failure cases
- fix dfa size check
- return error on namespace mismatch in verify_header
- use target task's context in apparmor_getprocattr()
Pull at your convenience; -rc2 or even later is fine, and if you want I
can just send the bug fixes up during the -rc2 window.
thanks
- john
The following changes since commit 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f:
Linux 7.0-rc1 (2026-02-22 13:18:59 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor tags/apparmor-pr-2026-04-23
for you to fetch changes up to 11b7df0952663f20ce72c9a22a3cf9278cf84db7:
apparmor/lsm: Fix aa_dfa_unpack's error handling in aa_setup_dfa_engine (2026-04-22 20:11:08 -0700)
----------------------------------------------------------------
+ Cleanups
- Use sysfs_emit in param_get_{audit,mode}
- Remove redundant if check in sk_peer_get_label
- Replace memcpy + NUL termination with kmemdup_nul in do_setattr
+ Bug Fixes
- Fix aa_dfa_unpack's error handling in aa_setup_dfa_engine
- Fix string overrun due to missing termination
- Fix wrong dentry in RENAME_EXCHANGE uid check
- fix unpack_tags to properly return error in failure cases
- fix dfa size check
- return error on namespace mismatch in verify_header
- use target task's context in apparmor_getprocattr()
----------------------------------------------------------------
Cengiz Can (1):
apparmor: use target task's context in apparmor_getprocattr()
Daniel J Blueman (1):
apparmor: Fix string overrun due to missing termination
Dudu Lu (1):
apparmor: Fix wrong dentry in RENAME_EXCHANGE uid check
GONG Ruiqi (1):
apparmor/lsm: Fix aa_dfa_unpack's error handling in aa_setup_dfa_engine
John Johansen (2):
apparmor: fix dfa size check
apparmor: fix unpack_tags to properly return error in failure cases
Massimiliano Pellizzer (1):
apparmor: return error on namespace mismatch in verify_header
Thorsten Blum (3):
apparmor: Replace memcpy + NUL termination with kmemdup_nul in do_setattr
apparmor: Remove redundant if check in sk_peer_get_label
apparmor: Use sysfs_emit in param_get_{audit,mode}
security/apparmor/lsm.c | 36 ++++++++++++++----------------------
security/apparmor/match.c | 2 +-
security/apparmor/path.c | 8 +++++---
security/apparmor/policy_unpack.c | 2 ++
4 files changed, 22 insertions(+), 26 deletions(-)
* Re: [PATCH RFC 1/3] LSM: add a flags field to the LSM hook definitions
From: Paul Moore @ 2026-04-24 1:19 UTC (permalink / raw)
To: Casey Schaufler, casey, linux-security-module
Cc: jmorris, serge, keescook, john.johansen, penguin-kernel,
stephen.smalley.work, selinux
In-Reply-To: <20260225192143.14448-2-casey@schaufler-ca.com>
On Feb 25, 2026 Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> Add a field for flags to the definition of LSM hooks. This allows
> for hooks to be identified at system initialization for special
> processing.
>
> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
> ---
> include/linux/bpf_lsm.h | 2 +-
> include/linux/lsm_hook_defs.h | 614 ++++++++++++++++++----------------
> include/linux/lsm_hooks.h | 4 +-
> kernel/bpf/bpf_lsm.c | 10 +-
> security/bpf/hooks.c | 2 +-
> security/security.c | 6 +-
> 6 files changed, 331 insertions(+), 307 deletions(-)
>
> diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
> index 643809cc78c3..d71ba8c87e79 100644
> --- a/include/linux/bpf_lsm.h
> +++ b/include/linux/bpf_lsm.h
> @@ -14,7 +14,7 @@
>
> #ifdef CONFIG_BPF_LSM
>
> -#define LSM_HOOK(RET, DEFAULT, NAME, ...) \
> +#define LSM_HOOK(RET, DEFAULT, FLAGS, NAME, ...) \
> RET bpf_lsm_##NAME(__VA_ARGS__);
> #include <linux/lsm_hook_defs.h>
> #undef LSM_HOOK
> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> index 8c42b4bde09c..acda3a02da97 100644
> --- a/include/linux/lsm_hook_defs.h
> +++ b/include/linux/lsm_hook_defs.h
> @@ -18,451 +18,475 @@
> * The macro LSM_HOOK is used to define the data structures required by
> * the LSM framework using the pattern:
> *
> - * LSM_HOOK(<return_type>, <default_value>, <hook_name>, args...)
> + * LSM_HOOK(<return_type>, <default_value>, <flags>, <single>,
> + * <hook_name>, args...)
> *
> * struct security_hook_heads {
> - * #define LSM_HOOK(RET, DEFAULT, NAME, ...) struct hlist_head NAME;
> + * #define LSM_HOOK(RET, DEFAULT, FLAGS, NAME, ...) struct hlist_head NAME;
> * #include <linux/lsm_hook_defs.h>
> * #undef LSM_HOOK
> * };
> */
> -LSM_HOOK(int, 0, binder_set_context_mgr, const struct cred *mgr)
> -LSM_HOOK(int, 0, binder_transaction, const struct cred *from,
> +LSM_HOOK(int, 0, 0, binder_set_context_mgr, const struct cred *mgr)
> +LSM_HOOK(int, 0, 0, binder_transaction, const struct cred *from,
> const struct cred *to)
I think adding a flag field to the LSM_HOOK() macro/definitions is a good
and useful addition, but I'd prefer if we created a LSM_FLAG_NONE #define
and used it here just so we could avoid the back-to-back 0's and do a bit
of self-documentation.
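As a rough illustration of the suggestion above, here is a self-contained userspace sketch of an X-macro hook table that uses a named "no flags" constant. Only `LSM_FLAG_NONE` and `LSM_FLAG_EXCLUSIVE` are names from this thread; their values, the `EXAMPLE_HOOKS` list, the choice of `secid_to_secctx` as an exclusive hook, and `hook_is_exclusive()` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LSM_FLAG_NONE		0u
#define LSM_FLAG_EXCLUSIVE	(1u << 0)	/* value assumed */

/* Tiny stand-in for lsm_hook_defs.h: X(RET, DEFAULT, FLAGS, NAME) */
#define EXAMPLE_HOOKS(X)					\
	X(int, 0, LSM_FLAG_NONE, binder_set_context_mgr)	\
	X(int, 0, LSM_FLAG_NONE, binder_transaction)		\
	X(int, 0, LSM_FLAG_EXCLUSIVE, secid_to_secctx)

struct hook_info {
	const char *name;
	unsigned int flags;
};

/* Expand the list once into a table of names and flags. */
#define AS_INFO(RET, DEFAULT, FLAGS, NAME) { #NAME, FLAGS },
static const struct hook_info example_hooks[] = { EXAMPLE_HOOKS(AS_INFO) };
#undef AS_INFO

/* Returns 1 if the named hook is marked exclusive, 0 otherwise. */
static int hook_is_exclusive(const char *name)
{
	size_t i;

	assert(name != NULL);

	for (i = 0; i < sizeof(example_hooks) / sizeof(example_hooks[0]); i++)
		if (strcmp(example_hooks[i].name, name) == 0)
			return !!(example_hooks[i].flags & LSM_FLAG_EXCLUSIVE);
	return 0;
}
```

Compared with `X(int, 0, 0, ...)`, the named constant makes it obvious which argument is the default return value and which is the flags field.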
--
paul-moore.com
* Re: [PATCH RFC 2/3] LSM: Enforce exclusive hooks
From: Paul Moore @ 2026-04-24 1:19 UTC (permalink / raw)
To: Casey Schaufler, casey, linux-security-module
Cc: jmorris, serge, keescook, john.johansen, penguin-kernel,
stephen.smalley.work, selinux
In-Reply-To: <20260225192143.14448-3-casey@schaufler-ca.com>
On Feb 25, 2026 Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> If an LSM hook is marked as exclusive via LSM_FLAG_EXCLUSIVE
> in lsm_hook_defs.h it will not be added to the set of hooks to
> be executed if a different LSM has already registered an
> exclusive hook.
>
> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
> ---
> include/linux/security.h | 2 ++
> security/lsm_init.c | 66 ++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 68 insertions(+)
>
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 83a646d72f6f..e3c137a1b30a 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -2404,4 +2404,6 @@ static inline void security_initramfs_populated(void)
> }
> #endif /* CONFIG_SECURITY */
>
> +extern u64 lsm_exclusive_hooks;
We already have the 'lsm_exclusive' variable in lsm_init.c, don't create
another variable that does the same thing. If the scope of the existing
variable isn't what you need, change that.
Although to be honest, I'm not in love with what you're doing with the
variable anyway, more on that later in the review.
> #endif /* ! __LINUX_SECURITY_H */
> diff --git a/security/lsm_init.c b/security/lsm_init.c
> index 05bd52e6b1f2..dc3c84387a7e 100644
> --- a/security/lsm_init.c
> +++ b/security/lsm_init.c
> @@ -356,6 +356,70 @@ static int __init lsm_static_call_init(struct security_hook_list *hl)
> return -ENOSPC;
> }
>
> +/*
> + * Hooks that are restricted to use by a single security module.
> + *
> + * Secmark hooks have not been converted from secids to lsm_props
> + * due to space limitations in packet headers.
If this is a general purpose mechanism for all types of LSM hooks, please
don't put commentary in here about a single class of hook.
If the only reason for doing all of this is secmark, just fix
secmark instead of going through all of this trouble.
> + * Conversions from a secid to a secctx are restricted to the
> + * single security module. All cases where there may be multiple
> + * modules providing the input data have been converted to use
> + * a lsm_prop instead of a secid.
Okay, yes, the paragraph above is true, I'm just not sure why it is
important here?
> + */
> +struct lsm_exclusive {
> + struct lsm_static_call *name;
> + char *namestr;
> + u32 flags;
> +};
> +
> +static __initdata struct lsm_exclusive lsm_exclusive_set[] = {
> +#define LSM_HOOK(RET, DEFAULT, FLAGS, NAME, ...) \
> + { .name = static_calls_table.NAME, .flags = FLAGS, .namestr = "" #NAME "" , },
> +#include <linux/lsm_hook_defs.h>
> +#undef LSM_HOOK
> +};
> +u64 lsm_exclusive_hooks;
> +EXPORT_SYMBOL(lsm_exclusive_hooks);
Unless I missed something, we really shouldn't need to export this, why
did you need EXPORT_SYMBOL() here?
> +/**
> + * lsm_exclusive_hook_denial - Check if exclusive hook is in use
> + * @hook: the hook to check
> + *
> + * Check if the hook in question is restricted to a single using LSM,
> + * and if the LSM providing single LSM hooks is defined.
> + *
> + * Returns true if the hook is exclusive and they are already provided,
> + * false otherwise.
> + */
> +static bool __init lsm_exclusive_hook_denial(struct security_hook_list *hook)
> +{
> + int i;
> +
> + if (lsm_exclusive_hooks == hook->lsmid->id)
> + return false;
> +
> + for (i = 0; i < ARRAY_SIZE(lsm_exclusive_set); i++) {
> + if (!(lsm_exclusive_set[i].flags & LSM_FLAG_EXCLUSIVE))
> + continue;
The logic on this looks a bit odd. What is wrong with something like the
pseudo code below?
  for (i = 0; i < ARRAY_SIZE(lsm_hooks); i++) {
    if (lsm_hooks[i] == hook) {
      if (lsm_hooks[i].flags & LSM_FLAG_EXCLUSIVE)
        return true;
      else
        return false;
    }
  }
> + if (hook->scalls == lsm_exclusive_set[i].name) {
> + if (lsm_exclusive_hooks) {
> + if (lsm_debug)
> + lsm_pr("%s denied for %s.\n",
> + lsm_exclusive_set[i].namestr,
> + hook->lsmid->name);
The lsm_pr_dbg() macro exists for this very reason.
> + return true;
> + }
> + if (lsm_debug)
> + lsm_pr("Exclusive hooks limited to %s.\n",
> + hook->lsmid->name);
Same as above.
> + lsm_exclusive_hooks = hook->lsmid->id;
> + break;
> + }
> + }
> + return false;
> +}
>
> /**
> * security_add_hooks - Add a LSM's hooks to the LSM framework's hook lists
> * @hooks: LSM hooks to add
> @@ -371,6 +435,8 @@ void __init security_add_hooks(struct security_hook_list *hooks, int count,
>
> for (i = 0; i < count; i++) {
> hooks[i].lsmid = lsmid;
> + if (lsm_exclusive_hook_denial(&hooks[i]))
> + continue;
> if (lsm_static_call_init(&hooks[i]))
> panic("exhausted LSM callback slots with LSM %s\n",
> lsmid->name);
I don't think we'd want to simply skip over a hook registration if the
LSM doesn't have access to an exclusive hook. At the very least it risks
unexpected behavior in the LSM and at the worst it prevents the LSM from
properly enforcing its security policy. There is a reason we have the
panic() call in the existing code if we are not able to register a hook.
The simplest solution here would be to panic, like we do today.
However, if you want to get fancy and enable LSMs to optionally adjust
their behavior to cope with the loss of a hook callback (I'm looking at
your patch 3/3 now), come up with a mechanism to report back to the LSM
that one or more of the callbacks were not registered. One quick thought
would be to return a non-zero error code and add an indicator/bool/flag
to the security_hook_list that would be set by security_add_hooks().
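One possible shape for that mechanism, as a userspace sketch only: `struct example_hook`, its `registered` field, `example_add_hooks()`, and the `-EBUSY` return are all hypothetical and do not reflect the real `security_hook_list` or `security_add_hooks()` signatures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define LSM_FLAG_EXCLUSIVE	(1u << 0)	/* value assumed */

/* Hypothetical, simplified stand-in for struct security_hook_list. */
struct example_hook {
	const char *name;
	unsigned int flags;
	unsigned int lsm_id;
	bool registered;	/* set by example_add_hooks() */
};

/* ID of the LSM that owns the exclusive hooks; 0 means unclaimed. */
static unsigned int exclusive_owner;

/*
 * Register the hooks of one LSM. Returns 0 if every hook was registered,
 * or -EBUSY if at least one exclusive hook was skipped; the caller can
 * then inspect each entry's 'registered' flag and adapt.
 */
static int example_add_hooks(struct example_hook *hooks, size_t count,
			     unsigned int lsm_id)
{
	size_t i;
	int rc = 0;

	assert(lsm_id != 0);	/* 0 is reserved for "unclaimed" */

	for (i = 0; i < count; i++) {
		hooks[i].lsm_id = lsm_id;

		if (hooks[i].flags & LSM_FLAG_EXCLUSIVE) {
			if (exclusive_owner && exclusive_owner != lsm_id) {
				/* Report the skip instead of hiding it. */
				hooks[i].registered = false;
				rc = -EBUSY;
				continue;
			}
			exclusive_owner = lsm_id;
		}
		hooks[i].registered = true;
	}
	return rc;
}
```

An LSM that cannot work without the skipped hook could still choose to panic on a non-zero return, which keeps today's fail-hard behaviour available.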
--
paul-moore.com