Linux Security Modules development


* Re: [PATCH v4 3/3] tpm: tpm_crb_ffa: revert defered_probed when tpm_crb_ffa is built-in
From: Jarkko Sakkinen @ 2026-06-02  1:45 UTC
  To: Yeoreum Yun
  Cc: linux-security-module, linux-kernel, linux-integrity, paul, zohar,
	roberto.sassu, noodles, sudeep.holla, jmorris, serge,
	dmitry.kasatkin, eric.snowberg, jgg
In-Reply-To: <ah0x+YDypYFzpFqt@e129823.arm.com>

On Mon, Jun 01, 2026 at 08:17:13AM +0100, Yeoreum Yun wrote:
> Hi Jarkko,
> 
> Sorry for the late answer.

It's all good; there's been a bug storm, so I'm glad that you have not been
around ;-)

BR, Jarkko

> 
> > On Mon, May 25, 2026 at 08:54:04AM +0100, Yeoreum Yun wrote:
> > > commit 746d9e9f62a6 ("tpm: tpm_crb_ffa: try to probe tpm_crb_ffa when it's build_in")
> > > forcefully probes tpm_crb_ffa when it is built-in, to integrate with IMA.
> > > 
> > > However, IMA now provides the IMA_INIT_LATE_SYNC build option, which
> > > initialises IMA at the late_initcall_sync level, so this change is no
> > > longer required.
> > > 
> > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > > ---
> > >  drivers/char/tpm/tpm_crb_ffa.c | 18 +++---------------
> > >  1 file changed, 3 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/drivers/char/tpm/tpm_crb_ffa.c b/drivers/char/tpm/tpm_crb_ffa.c
> > > index 99f1c1e5644b..025c4d4b17ca 100644
> > > --- a/drivers/char/tpm/tpm_crb_ffa.c
> > > +++ b/drivers/char/tpm/tpm_crb_ffa.c
> > > @@ -177,23 +177,13 @@ static int tpm_crb_ffa_to_linux_errno(int errno)
> > >   */
> > >  int tpm_crb_ffa_init(void)
> > >  {
> > > -	int ret = 0;
> > > -
> > > -	if (!IS_MODULE(CONFIG_TCG_ARM_CRB_FFA)) {
> > > -		ret = ffa_register(&tpm_crb_ffa_driver);
> > > -		if (ret) {
> > > -			tpm_crb_ffa = ERR_PTR(-ENODEV);
> > > -			return ret;
> > > -		}
> > > -	}
> > > -
> > >  	if (!tpm_crb_ffa)
> > > -		ret = -ENOENT;
> > > +		return -ENOENT;
> > >  
> > >  	if (IS_ERR_VALUE(tpm_crb_ffa))
> > > -		ret = -ENODEV;
> > > +		return -ENODEV;
> > >  
> > > -	return ret;
> > > +	return 0;
> > >  }
> > >  EXPORT_SYMBOL_GPL(tpm_crb_ffa_init);
> > >  
> > > @@ -405,9 +395,7 @@ static struct ffa_driver tpm_crb_ffa_driver = {
> > >  	.id_table = tpm_crb_ffa_device_id,
> > >  };
> > >  
> > > -#ifdef MODULE
> > >  module_ffa_driver(tpm_crb_ffa_driver);
> > > -#endif
> > >  
> > >  MODULE_AUTHOR("Arm");
> > >  MODULE_DESCRIPTION("TPM CRB FFA driver");
> > > -- 
> > > LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
> > > 
> > 
> > How would we sync up this patch? Through which tree, etc.?
> 
> IMHO, the IMA-relevant changes should go into the IMA tree.
> However, I think this patch would be much easier to sync through Sudeep's
> FF-A tree, where FF-A initialisation is reverted to device_initcall,
> unless you're uncomfortable with that.
> 
> For this, it might be better to split this patch out of the series:
> with the change above, the deferred probe of FF-A would make
> registration of the built-in tpm_crb_ffa driver fail.
> 
> @Sudeep what do you think?
> 
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux.git/commit/?h=for-next/ffa/updates&id=cc7e8f21b9f0c229d68cf19a837cba82b5ac2d87 [0]
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux.git/commit/?h=for-next/ffa/updates&id=e659fc8e537c7a21d5d693d6f30d8852f2fa8d91 [1]
> 
> -- 
> Sincerely,
> Yeoreum Yun


* Landstrip
From: Jarkko Sakkinen @ 2026-06-02  1:42 UTC
  To: linux-integrity; +Cc: linux-security-module

I played with the idea of whether the Landlock LSM could be used to build
a conceptually better-fitting sandbox for programs such as Anthropic
Sandbox Runtime [1].

After some missteps at first I got it pulled together quite well:

https://crates.io/crates/landstrip

To see it in action, I also have a fork of the pi-hashline-readmap plugin
[2], a hand-picked test case that I wanted to try out because it already
hooks the bash tool command for compressed output.

I just thought that this might interest some, as Landlock is not really
an over-used kernel feature in the "application sense".

In my opinion at least, this has a lower barrier to deployment and is
more failure tolerant than a Bubblewrap-based container for this use and
purpose.

[1] https://github.com/anthropic-experimental/sandbox-runtime/issues/291
[2] https://github.com/jarkkojs/pi-hashline-readmap

BR, Jarkko


* Re: [PATCH v5 11/13] ima: Support staging and deleting N measurements entries
From: steven chen @ 2026-06-01 23:28 UTC
  To: Mimi Zohar, Roberto Sassu, corbet, skhan, dmitry.kasatkin,
	eric.snowberg, paul, jmorris, serge
  Cc: linux-doc, linux-kernel, linux-integrity, linux-security-module,
	gregorylumen, nramas, Roberto Sassu, steven chen
In-Reply-To: <f00aabe05aeee7f6fd0426fd992839758d810da7.camel@linux.ibm.com>

On 5/26/2026 4:08 AM, Mimi Zohar wrote:
> On Wed, 2026-04-29 at 18:03 +0200, Roberto Sassu wrote:
>> From: Roberto Sassu <roberto.sassu@huawei.com>
>>
>> Add support for sending a value N between 1 and ULONG_MAX to the IMA
>> original measurement interface. This value represents the number of
>> measurements that should be deleted from the current measurements list. In
>> this case, measurements are staged in an internal non-user visible list,
>> and immediately deleted.
>>
>> This staging method allows the remote attestation agents to easily separate
>> the measurements that were verified (staged and deleted) from those that
>> weren't due to the race between taking a TPM quote and reading the
>> measurements list.
> The reason for removing records from the IMA measurement list is to free kernel
> memory.  However, the level of precision in removing only those measurements
> needed for the quote seems necessary only if the measurement records are not
> being saved.  Upstreaming a feature to remove measurement records from the IMA
> measurement list is to address the kernel memory issue — clearly not to drop
> measurement records and break attestation.
>
>> In order to minimize the locking time of ima_extend_list_mutex, deleting
>> N entries is realized by doing a lockless walk in the current measurements
>> list to determine the N-th entry to cut, to cut the current measurements
>> list under the lock, and by deleting the excess entries after releasing the
>> lock.
>>
>> Flushing the hash table is not supported for N entries, since it would
>> require removing the N entries one by one from the hash table under the
>> ima_extend_list_mutex lock, which would increase the locking time.
>>
>> The ima_extend_list_mutex lock is necessary in ima_dump_measurement_list()
>> because ima_queue_delete_partial() uses __list_cut_position() to modify
>> ima_measurements, for which no RCU-safe variant exists. For the staging
>> with prompt flavor alone, list_replace_rcu() could have been used instead,
>> but since both flavors share the same kexec serialization path, the mutex
>> is required regardless.
> Thank you for the clear explanation for the changes and limitations required to
> support this feature.
>
> The changes needed for supporting "stage and delete N" measurement records
> should be limited to this patch.  Patch 9/13 should have used
> list_replace_rcu(), without the mutex_lock.
>
>> Link: https://github.com/linux-integrity/linux/issues/1
>> Suggested-by: Steven Chen <chenste@linux.microsoft.com>
>> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
> Otherwise,
>
> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>

Tested-by: Steven Chen <chenste@linux.microsoft.com>
Reviewed-by: Steven Chen <chenste@linux.microsoft.com>
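
For readers following along without the patch open, the three-phase
deletion described in the quoted commit message (walk without the lock to
find the N-th entry, cut under the lock, free after releasing it) can be
sketched in userspace roughly as below. This is only an illustration, not
the IMA code: struct mlist, mlist_append() and mlist_delete_first_n() are
hypothetical names, a pthread mutex stands in for ima_extend_list_mutex,
and the unlocked walk is only safe under the assumptions that entries are
appended at the tail and that there is a single deleter. Refusing the
request when fewer than N entries exist is also an assumption of this
sketch.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one measurement list entry. */
struct entry {
	int id;
	struct entry *next;
};

/* Hypothetical append-only list; "lock" models ima_extend_list_mutex. */
struct mlist {
	struct entry *head;
	struct entry *tail;
	pthread_mutex_t lock;
};

void mlist_init(struct mlist *l)
{
	l->head = NULL;
	l->tail = NULL;
	pthread_mutex_init(&l->lock, NULL);
}

/* Append at the tail under the lock; returns 0 on success, -1 on ENOMEM. */
int mlist_append(struct mlist *l, int id)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->id = id;
	e->next = NULL;

	pthread_mutex_lock(&l->lock);
	if (l->tail)
		l->tail->next = e;
	else
		l->head = e;
	l->tail = e;
	pthread_mutex_unlock(&l->lock);
	return 0;
}

/*
 * Delete the first n entries and return how many were freed.
 * Assumption of this sketch: returns 0 if fewer than n entries exist.
 */
size_t mlist_delete_first_n(struct mlist *l, size_t n)
{
	struct entry *cut = l->head, *batch, *next;
	size_t i = 1, freed = 0;

	if (n == 0 || !cut)
		return 0;

	/* Phase 1: find the n-th entry without taking the lock. */
	while (i < n && cut->next) {
		cut = cut->next;
		i++;
	}
	if (i < n)
		return 0;

	/* Phase 2: detach the first n entries; only this runs locked. */
	pthread_mutex_lock(&l->lock);
	batch = l->head;
	l->head = cut->next;
	if (!l->head)
		l->tail = NULL;
	cut->next = NULL;
	pthread_mutex_unlock(&l->lock);

	/* Phase 3: free the detached entries after releasing the lock. */
	while (batch) {
		next = batch->next;
		free(batch);
		batch = next;
		freed++;
	}
	return freed;
}
```

The point of the split is that the time spent holding the lock no longer
depends on N: only the pointer updates in phase 2 are serialized against
new measurements.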



* Re: [PATCH v4 0/2] Delete task_euid()
From: Paul Moore @ 2026-06-01 23:13 UTC
  To: Alice Ryhl
  Cc: Serge Hallyn, Jonathan Corbet, Greg Kroah-Hartman, Shuah Khan,
	Alex Shi, Yanteng Si, Dongliang Mu, Miguel Ojeda, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, Danilo Krummrich, Jann Horn, linux-security-module,
	linux-doc, linux-kernel, rust-for-linux
In-Reply-To: <20260529-remove-task-euid-v4-0-07cbdf3af980@google.com>

On Fri, May 29, 2026 at 5:33 AM Alice Ryhl <aliceryhl@google.com> wrote:
>
> The task_euid() method is a very weird method, and Binder was the only
> user. As of commit 65b672152289 ("binder: use current_euid() for
> transaction sender identity") Binder doesn't use task_euid() anymore,
> so we can delete this method.

Given the problems from last time, it seems like it might be prudent
to let the commit have some time to "breathe" in a proper release. I'd
suggest not merging this in the upcoming v7.2 merge window and instead
waiting for v7.3.

> My suggestion would be to merge this through the LSM tree.

That's fine with me.  I'd also suggest updating the commit description
in patch 1/2 to indicate that binder is no longer using task_euid();
it currently reads like it is still being used.

-- 
paul-moore.com


* Re: [PATCH v3 1/2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via F_SETOWN to invoker's pgid
From: Günther Noack @ 2026-06-01 22:08 UTC
  To: hexlabsecurity
  Cc: Mickaël Salaün, Justin Suess, gnoack@google.com,
	linux-security-module@vger.kernel.org, stable@vger.kernel.org
In-Reply-To: <7rvmLIHR1Zh8RDF1IY1-SYRHzErgw9gPHq0k98RLYVsmHqAejjxcuJi8V3QaSbW-SnNvY5tfM2Xn_S1dEajKV_f7iyitoPwJgOSTZQ0nytc=@proton.me>

On Fri, May 29, 2026 at 07:07:30PM +0000, hexlabsecurity@proton.me wrote:
> From b5fdc79ce1cb2881d59dfed01d3d9170306be9e8 Mon Sep 17 00:00:00 2001
> From: Bryam Vargas <hexlabsecurity@proton.me>
> Date: Fri, 29 May 2026 12:49:41 -0500
> Subject: [PATCH v3 1/2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass via
>  F_SETOWN to invoker's pgid
> 
> A Landlock-restricted process can bypass LANDLOCK_SCOPE_SIGNAL on the
> SIGIO delivery path and deliver arbitrary signals (including SIGKILL via
> F_SETSIG) to non-Landlocked targets that share its pgid, by exploiting a
> producer-side cache-vs-live evaluation gap.
> 
> The SIGIO path in hook_file_send_sigiotask() consults a cached subject
> stored in landlock_file(file)->fown_subject at fcntl(F_SETOWN) time
> (via hook_file_set_fowner()), instead of evaluating the live Landlock
> domain of the invoking task at signal-send time. The capture is gated
> by control_current_fowner(), which returns false (skipping capture)
> when pid_task(fown->pid, fown->pid_type) is in current's thread group.
> 
> This is correct for PIDTYPE_TGID / PIDTYPE_PID, where the target is a
> single task sharing current's cred. It is unsafe for PIDTYPE_PGID and
> PIDTYPE_SID: when current is at the head of its pgid hlist -- the
> default placement after fork(), hlist_add_head_rcu() in kernel/fork.c --
> pid_task(pgid, PIDTYPE_PGID) resolves to current itself,
> same_thread_group(current, current) is true, the capture is skipped, and
> fown_subject.domain stays NULL. hook_file_send_sigiotask() then
> short-circuits at "if (!subject->domain) return 0;", letting the kernel
> fan the signal out to every member of the group, including tasks outside
> current's Landlock domain that SCOPE_SIGNAL is supposed to protect.
> 
> The direct kill() path (hook_task_kill) is unaffected: it evaluates
> current's live domain on every call. Only the cached SIGIO path is
> broken.
> 
> Tighten control_current_fowner() to apply the thread-group exemption
> only when the target identifies a single task whose Landlock cred is
> necessarily shared with current (PIDTYPE_TGID, PIDTYPE_PID). For
> PIDTYPE_PGID and PIDTYPE_SID, always capture the current Landlock
> subject so the consumer's scope check runs against every member of the
> group at delivery time.
> 
> Stable kernels before the fown_subject conversion store the domain in
> landlock_file(file)->fown_domain; control_current_fowner() is identical
> there, so the same exemption and the same fix apply.
> 
> Fixes: 18eb75f3af40 ("landlock: Always allow signals between threads of the same process")
> Cc: stable@vger.kernel.org
> Reported-by: Bryam Vargas <hexlabsecurity@proton.me>
> Tested-by: Justin Suess <utilityemal77@gmail.com>
> Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
> ---
>  security/landlock/fs.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index c1ecfe239032..edaa52572cbd 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -1909,6 +1909,18 @@ static bool control_current_fowner(struct fown_struct *const fown)
>  	if (!p)
>  		return true;
>  
> +	/*
> +	 * For PIDTYPE_PGID and PIDTYPE_SID, signal delivery fans out to
> +	 * every member of the group at SIGIO time. Even when pid_task()
> +	 * resolves to current itself (e.g., current is the pgid hlist
> +	 * head post-fork), non-current members of the group are still
> +	 * valid targets that must be checked by hook_file_send_sigiotask().
> +	 * Always capture the current subject for those types so the
> +	 * consumer scope check runs against the live fown_subject.
> +	 */
> +	if (fown->pid_type == PIDTYPE_PGID || fown->pid_type == PIDTYPE_SID)
> +		return true;
> +
>  	return !same_thread_group(p, current);
>  }

The reason why the same_thread_group() check exists is so that Go
programs that had to use libpsx instead of TSYNC had a way to signal
their own OS threads at the C level (a feature used by linked C
libraries and specifically by libpsx itself, so it prevented nested
Landlock domains).

(a) On Linux 7.0, the Go-Landlock library automatically uses TSYNC so
    this is not a problem any more.

(b) On earlier Linux versions

    * libpsx signaling is also going to continue working,
      because it uses normal signals instead of SIGIO

    * other libraries are also likely to continue working, unless they
      use the somewhat obscure SIGIO with PIDTYPE_PGID or PIDTYPE_SID.

    There is little incentive to use SIGIO in a pure Go program, as
    the runtime already implements file descriptor polling logic (with
    epoll, which is anyway a better choice)

So, this looks fine from the Go perspective; I doubt that this has
practical implications for Go.

Thank you for spotting this and providing a fix! 🙏

–Günther
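
As a reading aid for the quoted diff, the decision made by
control_current_fowner() before and after the patch can be modelled as a
pure function. This is a userspace sketch, not the Landlock code: struct
fown_model and its fields are hypothetical stand-ins for what the kernel
derives from pid_task() and same_thread_group(), and "true" means that
the Landlock subject is captured at F_SETOWN time.

```c
#include <stdbool.h>

/* Hypothetical mirror of the pid types named in the patch. */
enum pid_kind { KIND_PID, KIND_TGID, KIND_PGID, KIND_SID };

/* Hypothetical summary of the inputs the kernel function looks at. */
struct fown_model {
	enum pid_kind type;
	bool target_exists;          /* pid_task() returned a task */
	bool target_in_thread_group; /* same_thread_group(p, current) */
};

/* Pre-patch logic: the thread-group exemption applies to every type. */
bool control_fowner_before(const struct fown_model *f)
{
	if (!f->target_exists)
		return true;
	return !f->target_in_thread_group;
}

/* Post-patch logic: group-wide targets always capture the subject. */
bool control_fowner_after(const struct fown_model *f)
{
	if (!f->target_exists)
		return true;
	if (f->type == KIND_PGID || f->type == KIND_SID)
		return true;
	return !f->target_in_thread_group;
}
```

The two functions differ only for PGID/SID targets that resolve to a task
in current's thread group, which is the case the report describes.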


* Re: [PATCH 01/11] params: bound array element output to the caller's page buffer
From: Matthew Wilcox @ 2026-06-01 20:23 UTC
  To: David Laight
  Cc: Kees Cook, Luis Chamberlain, Pengpeng Hou, stable, Petr Pavlu,
	Richard Weinberger, Anton Ivanov, Johannes Berg,
	Rafael J. Wysocki, Len Brown, Corey Minyard, Gabriel Somlo,
	Michael S. Tsirkin, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	Tvrtko Ursulin, David Airlie, Simona Vetter, Bart Van Assche,
	Jason Gunthorpe, Leon Romanovsky, Laurent Pinchart, Hans de Goede,
	Mauro Carvalho Chehab, Bjorn Helgaas, Hannes Reinecke,
	James E.J. Bottomley, Martin K. Petersen, Daniel Lezcano,
	Zhang Rui, Lukasz Luba, Greg Kroah-Hartman, Jiri Slaby,
	Alan Stern, Jason Wang, Xuan Zhuo, Eugenio Pérez,
	Jason Baron, Jim Cromie, Tiwei Bie, Benjamin Berg,
	Ilpo Järvinen, David E. Box, Maciej W. Rozycki,
	Srinivas Pandruvada, Peter Zijlstra, Heiko Carstens,
	Vasily Gorbik, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Vinod Koul, Frank Li, Daniel Gomez, Sami Tolvanen,
	Aaron Tomlin, Alexander Potapenko, Marco Elver, Dmitry Vyukov,
	Andrew Morton, John Johansen, Paul Moore, James Morris,
	Serge E. Hallyn, Andy Shevchenko, Georgia Garcia, kvm, dmaengine,
	linux-modules, kasan-dev, linux-mm, apparmor,
	linux-security-module, linux-um, linux-acpi, openipmi-developer,
	qemu-devel, intel-gfx, dri-devel, linux-rdma, linux-media,
	linux-pci, linux-scsi, linux-pm, linuxppc-dev, linux-serial,
	linux-usb, usb-storage, virtualization, linux-kernel, linux-arch,
	netdev, linux-fsdevel, linux-hardening
In-Reply-To: <20260521174631.71a06440@pumpkin>

On Thu, May 21, 2026 at 05:46:31PM +0100, David Laight wrote:
> On Thu, 21 May 2026 06:33:14 -0700
> Kees Cook <kees@kernel.org> wrote:
> > Collect each element into a temporary PAGE_SIZE buffer first and then
> > copy only the remaining space into the caller's page buffer.
> 
> Should this be using a 4k buffer on all architectures?
> Initially perhaps just using a different name for the constant until
> all the associated PAGE_SIZE limits have been removed.

If we're actually going to think about this, even 4KiB is too big.
An 80x25 terminal is 2000 bytes (assuming no utf8), so 4KiB is two
entire screenfuls.  Limiting to 2048 would seem reasonable to me.


* Re: [PATCH 00/11] Convert moduleparams to seq_buf
From: Kees Cook @ 2026-06-01 19:59 UTC
  To: Petr Pavlu
  Cc: Luis Chamberlain, Pengpeng Hou, Richard Weinberger, Anton Ivanov,
	Johannes Berg, Rafael J. Wysocki, Len Brown, Corey Minyard,
	Gabriel Somlo, Michael S. Tsirkin, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, Tvrtko Ursulin, David Airlie, Simona Vetter,
	Bart Van Assche, Jason Gunthorpe, Leon Romanovsky,
	Laurent Pinchart, Hans de Goede, Mauro Carvalho Chehab,
	Bjorn Helgaas, Hannes Reinecke, James E.J. Bottomley,
	Martin K. Petersen, Daniel Lezcano, Zhang Rui, Lukasz Luba,
	Greg Kroah-Hartman, Jiri Slaby, Alan Stern, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, Jason Baron, Jim Cromie, Tiwei Bie,
	Benjamin Berg, Ilpo Järvinen, David E. Box,
	Maciej W. Rozycki, Srinivas Pandruvada, Peter Zijlstra,
	Heiko Carstens, Vasily Gorbik, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Vinod Koul, Frank Li, Daniel Gomez, Sami Tolvanen,
	Aaron Tomlin, Alexander Potapenko, Marco Elver, Dmitry Vyukov,
	Andrew Morton, John Johansen, Paul Moore, James Morris,
	Serge E. Hallyn, Andy Shevchenko, Georgia Garcia, kvm, dmaengine,
	linux-modules, kasan-dev, linux-mm, apparmor,
	linux-security-module, linux-um, linux-acpi, openipmi-developer,
	qemu-devel, intel-gfx, dri-devel, linux-rdma, linux-media,
	linux-pci, linux-scsi, linux-pm, linuxppc-dev, linux-serial,
	linux-usb, usb-storage, virtualization, linux-kernel, linux-arch,
	netdev, linux-fsdevel, linux-hardening
In-Reply-To: <88c5ca1d-eeda-4023-bc7a-397b92780db9@suse.com>

On Tue, May 26, 2026 at 08:53:06AM +0200, Petr Pavlu wrote:
> On 5/21/26 3:33 PM, Kees Cook wrote:
> > Hi,
> > 
> > I tried to trim the CC list here, but it's still pretty huge...
> > 
> > We've had a long-standing issue with "write to a string pointer" callbacks
> > that don't bounds check the destination (and for which the bounds is
> > also not part of the callback prototype, even if it is "known" to be
> > PAGE_SIZE, which sysfs_emit() depends on). Both moduleparams and sysfs
> > use this pattern. As a first step, and to test the migration method,
> > migrate moduleparams first.
> > 
> > There are 2 "mechanical" treewide patches that are handled by Coccinelle:
> > - treewide: Convert struct kernel_param_ops initializers to DEFINE_KERNEL_PARAM_OPS
> > - treewide: Convert custom kernel_param_ops .get callbacks to seq_buf via cocci
> > 
> > The last treewide patch is manual, and may need to be broken up into
> > per-subsystem patches, though I'd prefer to avoid this, as it would
> > extend the migration from 1 release to at least 2 releases. (1 to
> > release the migration infrastructure, then 1 release to collect all the
> > subsystem changes, and possibly 1 more release to remove the migration
> > infrastructure.)
> > 
> > Thoughts, questions?
> 
> This looks reasonable to me. I added a few minor comments on the patches
> but they already look solid.

Thanks for the review! I'll get a v2 prepared with your notes addressed. :)

-Kees

-- 
Kees Cook
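
To make the quoted problem statement concrete: a callback that writes
through a bare string pointer cannot know where the destination ends,
whereas a buffer object that carries its own size can refuse to write
past it. The sketch below is a userspace miniature of that idea. It is
not the kernel's seq_buf API; struct bounded_buf, bb_init() and
bb_printf() are hypothetical names made up for this illustration.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bounded output buffer: the size travels with the pointer. */
struct bounded_buf {
	char *data;
	size_t size;     /* total capacity, including the trailing NUL */
	size_t len;      /* bytes used so far, excluding the NUL */
	bool overflowed; /* sticky: set once any write did not fit */
};

/* "size" must be at least 1 so that the buffer can hold the NUL. */
void bb_init(struct bounded_buf *b, char *data, size_t size)
{
	b->data = data;
	b->size = size;
	b->len = 0;
	b->overflowed = false;
	data[0] = '\0';
}

/* Append formatted text; returns 0 on success, -1 if it did not fit. */
int bb_printf(struct bounded_buf *b, const char *fmt, ...)
{
	size_t room = b->size - b->len;
	va_list ap;
	int n;

	if (b->overflowed)
		return -1;

	va_start(ap, fmt);
	n = vsnprintf(b->data + b->len, room, fmt, ap);
	va_end(ap);

	if (n < 0) {
		/* Encoding error: drop the partial write. */
		b->data[b->len] = '\0';
		b->overflowed = true;
		return -1;
	}
	if ((size_t)n >= room) {
		/* Truncated: vsnprintf() already NUL-terminated the buffer. */
		b->len = b->size - 1;
		b->overflowed = true;
		return -1;
	}
	b->len += (size_t)n;
	return 0;
}
```

A getter written against such an object never needs to be told that the
destination is "known" to be PAGE_SIZE; the bound is part of the argument
it receives.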


* Re: security_task_prctl: why -ENOSYS
From: Serge E. Hallyn @ 2026-06-01 19:33 UTC
  To: William Roberts; +Cc: Serge E. Hallyn, Casey Schaufler, LSM, SElinux list
In-Reply-To: <CAFftDdqyTrr=wS2hUT1EvAqVsbegPRh9Y-rH5S3VEpvXbJ6QRg@mail.gmail.com>

On Mon, Jun 01, 2026 at 02:01:07PM -0500, William Roberts wrote:
> <snip>
> >
> > How about security_task_prctl_allowed()?  (Mirroring security_uring_*)
> >
> > Renaming the existing hook security_task_prctl_handle() also wouldn't
> > be too bad, but that's probably more churn than it's worth.
> >
> 
> Yeah if something else is already done, I'll just copy their
> convention. I went with suffix _check for now.
> 
> Another possible issue that I think other upstream communities may
> have with adding an additional hook is that there will be two hooks
> in the prctl syscall path.
> I am not sure if that's a show stopper?

It's not a hot path, so I wouldn't think so.


* Re: security_task_prctl: why -ENOSYS
From: William Roberts @ 2026-06-01 19:01 UTC
  To: Serge E. Hallyn; +Cc: Casey Schaufler, LSM, SElinux list
In-Reply-To: <ahoaFz8DMMvWgqL+@mail.hallyn.com>

<snip>
>
> How about security_task_prctl_allowed()?  (Mirroring security_uring_*)
>
> Renaming the existing hook security_task_prctl_handle() also wouldn't
> be too bad, but that's probably more churn than it's worth.
>

Yeah if something else is already done, I'll just copy their
convention. I went with suffix _check for now.

Another possible issue that I think other upstream communities may
have with adding an additional hook is that there will be two hooks
in the prctl syscall path.
I am not sure if that's a show stopper?


* [PATCH v5 4/4] tpm: tpm_crb_ffa: revert defered_probed when tpm_crb_ffa is built-in
From: Yeoreum Yun @ 2026-06-01 14:27 UTC
  To: linux-security-module, linux-kernel, linux-integrity
  Cc: paul, zohar, roberto.sassu, noodles, jarkko, sudeep.holla,
	jmorris, serge, dmitry.kasatkin, eric.snowberg, jgg, Yeoreum Yun
In-Reply-To: <20260601142749.3379697-1-yeoreum.yun@arm.com>

commit 746d9e9f62a6 ("tpm: tpm_crb_ffa: try to probe tpm_crb_ffa when it's build_in")
forcefully probes tpm_crb_ffa when it is built-in, to integrate with IMA.

However, IMA now provides the IMA_INIT_LATE_SYNC build option, which
initialises IMA at the late_initcall_sync level, so this change is no
longer required.

Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
---
 drivers/char/tpm/tpm_crb_ffa.c | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/drivers/char/tpm/tpm_crb_ffa.c b/drivers/char/tpm/tpm_crb_ffa.c
index 99f1c1e5644b..025c4d4b17ca 100644
--- a/drivers/char/tpm/tpm_crb_ffa.c
+++ b/drivers/char/tpm/tpm_crb_ffa.c
@@ -177,23 +177,13 @@ static int tpm_crb_ffa_to_linux_errno(int errno)
  */
 int tpm_crb_ffa_init(void)
 {
-	int ret = 0;
-
-	if (!IS_MODULE(CONFIG_TCG_ARM_CRB_FFA)) {
-		ret = ffa_register(&tpm_crb_ffa_driver);
-		if (ret) {
-			tpm_crb_ffa = ERR_PTR(-ENODEV);
-			return ret;
-		}
-	}
-
 	if (!tpm_crb_ffa)
-		ret = -ENOENT;
+		return -ENOENT;
 
 	if (IS_ERR_VALUE(tpm_crb_ffa))
-		ret = -ENODEV;
+		return -ENODEV;
 
-	return ret;
+	return 0;
 }
 EXPORT_SYMBOL_GPL(tpm_crb_ffa_init);
 
@@ -405,9 +395,7 @@ static struct ffa_driver tpm_crb_ffa_driver = {
 	.id_table = tpm_crb_ffa_device_id,
 };
 
-#ifdef MODULE
 module_ffa_driver(tpm_crb_ffa_driver);
-#endif
 
 MODULE_AUTHOR("Arm");
 MODULE_DESCRIPTION("TPM CRB FFA driver");
-- 
LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
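
For readers without the kernel tree at hand, the post-patch control flow
of tpm_crb_ffa_init() can be modelled in userspace as below. This is a
sketch under stated assumptions: the macros are simplified stand-ins for
the kernel's error-pointer helpers, crb_ffa_init_model() is a
hypothetical name, and the pointer argument stands in for the driver's
tpm_crb_ffa state.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
#define ERR_PTR(err) ((void *)(long)(err))
#define IS_ERR_VALUE(p) ((unsigned long)(p) >= (unsigned long)-MAX_ERRNO)

/*
 * Hypothetical model of the function after the patch: three early
 * returns, with no driver registration and no "ret" accumulator.
 */
int crb_ffa_init_model(const void *tpm_crb_ffa)
{
	if (!tpm_crb_ffa)
		return -ENOENT;	/* pointer still NULL */

	if (IS_ERR_VALUE(tpm_crb_ffa))
		return -ENODEV;	/* pointer holds an encoded errno */

	return 0;		/* pointer refers to a real object */
}
```

With early returns, the NULL case can no longer fall through into the
IS_ERR_VALUE() check, which the removed "ret = ..." form allowed.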



* [PATCH v5 3/4] security: ima: rename boot_aggregate when ima is initialised at late_sync
From: Yeoreum Yun @ 2026-06-01 14:27 UTC
  To: linux-security-module, linux-kernel, linux-integrity
  Cc: paul, zohar, roberto.sassu, noodles, jarkko, sudeep.holla,
	jmorris, serge, dmitry.kasatkin, eric.snowberg, jgg,
	Jonathan McDowell
In-Reply-To: <20260601142749.3379697-1-yeoreum.yun@arm.com>

From: Jonathan McDowell <noodles@meta.com>

The Linux IMA (Integrity Measurement Architecture) subsystem used for
secure boot, file integrity, or remote attestation cannot be a loadable
module for a few reasons, listed below:

 o Boot-Time Integrity: IMA’s main role is to measure and appraise files
   before they are used. This includes measuring critical system files
   during early boot (e.g., init, init scripts, login binaries). If IMA
   were a module, it would be loaded too late to cover those.

 o TPM Dependency: IMA integrates tightly with the TPM to record
   measurements into PCRs. The TPM must be initialized early (ideally
   before init_ima()), which aligns with IMA being built-in.

 o Security Model: IMA is part of a Trusted Computing Base (TCB). Making
   it a module would weaken the security model, as a potentially
   compromised system could delay or tamper with its initialization.

IMA must be built-in to ensure it starts measuring from the earliest
possible point in boot, which in turn implies that the TPM must be
initialised and ready to use before IMA.

Unfortunately some TPM drivers (such as Arm FF-A, or SPI attached TPM
devices) are not reliably available during the initcall_late stage,
resulting in a log error:

  ima: No TPM chip found, activating TPM-bypass!

To address this issue, IMA_INIT_LATE_SYNC is introduced.
However, a remote attestation service cannot determine when IMA has been
initialized because the boot_aggregate measurement name remains unchanged,
even though IMA is initialized later at late_initcall_sync when
IMA_INIT_LATE_SYNC is enabled.

Therefore, use a distinct boot_aggregate name when IMA_INIT_LATE_SYNC
is enabled, allowing the remote attestation service to identify
when IMA has been initialized.

Signed-off-by: Jonathan McDowell <noodles@meta.com>
[yeoreum.yun@arm.com: modified to align with the IMA_INIT_LATE_SYNC change]
---
 security/integrity/ima/ima.h              |  1 +
 security/integrity/ima/ima_init.c         | 15 +++++++++++----
 security/integrity/ima/ima_template_lib.c |  3 ++-
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 69e9bf0b82c6..194b195cec1e 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -66,6 +66,7 @@ extern struct ima_algo_desc *ima_algo_array __ro_after_init;
 extern int ima_appraise;
 extern struct tpm_chip *ima_tpm_chip;
 extern const char boot_aggregate_name[];
+extern const char boot_aggregate_late_name[];
 
 /* IMA event related data */
 struct ima_event_data {
diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
index a2f34f2d8ad7..4c24bd535466 100644
--- a/security/integrity/ima/ima_init.c
+++ b/security/integrity/ima/ima_init.c
@@ -22,6 +22,7 @@
 
 /* name for boot aggregate entry */
 const char boot_aggregate_name[] = "boot_aggregate";
+const char boot_aggregate_late_name[] = "boot_aggregate_late";
 struct tpm_chip *ima_tpm_chip;
 
 /* Add the boot aggregate to the IMA measurement list and extend
@@ -45,11 +46,11 @@ static int __init ima_add_boot_aggregate(void)
 	const char *audit_cause = "ENOMEM";
 	struct ima_template_entry *entry;
 	struct ima_iint_cache tmp_iint, *iint = &tmp_iint;
-	struct ima_event_data event_data = { .iint = iint,
-					     .filename = boot_aggregate_name };
+	struct ima_event_data event_data = { .iint = iint };
 	struct ima_max_digest_data hash;
 	struct ima_digest_data *hash_hdr = container_of(&hash.hdr,
 						struct ima_digest_data, hdr);
+	const char *filename;
 	int result = -ENOMEM;
 	int violation = 0;
 
@@ -59,6 +60,12 @@ static int __init ima_add_boot_aggregate(void)
 	iint->ima_hash->algo = ima_hash_algo;
 	iint->ima_hash->length = hash_digest_size[ima_hash_algo];
 
+	if (IS_ENABLED(CONFIG_IMA_INIT_LATE_SYNC))
+		filename = boot_aggregate_late_name;
+	else
+		filename = boot_aggregate_name;
+	event_data.filename = filename;
+
 	/*
 	 * With TPM 2.0 hash agility, TPM chips could support multiple TPM
 	 * PCR banks, allowing firmware to configure and enable different
@@ -86,7 +93,7 @@ static int __init ima_add_boot_aggregate(void)
 	}
 
 	result = ima_store_template(entry, violation, NULL,
-				    boot_aggregate_name,
+				    filename,
 				    CONFIG_IMA_MEASURE_PCR_IDX);
 	if (result < 0) {
 		ima_free_template_entry(entry);
@@ -95,7 +102,7 @@ static int __init ima_add_boot_aggregate(void)
 	}
 	return 0;
 err_out:
-	integrity_audit_msg(AUDIT_INTEGRITY_PCR, NULL, boot_aggregate_name, op,
+	integrity_audit_msg(AUDIT_INTEGRITY_PCR, NULL, filename, op,
 			    audit_cause, result, 0);
 	return result;
 }
diff --git a/security/integrity/ima/ima_template_lib.c b/security/integrity/ima/ima_template_lib.c
index 0e627eac9c33..8a89236f926c 100644
--- a/security/integrity/ima/ima_template_lib.c
+++ b/security/integrity/ima/ima_template_lib.c
@@ -363,7 +363,8 @@ int ima_eventdigest_init(struct ima_event_data *event_data,
 		goto out;
 	}
 
-	if ((const char *)event_data->filename == boot_aggregate_name) {
+	if ((const char *)event_data->filename == boot_aggregate_name ||
+	    (const char *)event_data->filename == boot_aggregate_late_name) {
 		if (ima_tpm_chip) {
 			hash.hdr.algo = HASH_ALGO_SHA1;
 			result = ima_calc_boot_aggregate(hash_hdr);
-- 
LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}



* [PATCH v5 2/4] security: ima: introduce IMA_INIT_LATE_SYNC option
From: Yeoreum Yun @ 2026-06-01 14:27 UTC
  To: linux-security-module, linux-kernel, linux-integrity
  Cc: paul, zohar, roberto.sassu, noodles, jarkko, sudeep.holla,
	jmorris, serge, dmitry.kasatkin, eric.snowberg, jgg, Yeoreum Yun
In-Reply-To: <20260601142749.3379697-1-yeoreum.yun@arm.com>

To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
the TPM driver must be built as built-in and
must be probed before the IMA subsystem is initialized.

However, when the TPM device operates over the FF-A protocol using
the CRB interface, probing fails and returns -EPROBE_DEFER if
the tpm_crb_ffa device — an FF-A device that provides the communication
interface to the tpm_crb driver — has not yet been probed.

To ensure the TPM device operating over the FF-A protocol with
the CRB interface is probed before IMA initialization,
the following conditions must be met:

1. The corresponding ffa_device must be registered,
   which is done via ffa_init().

2. The tpm_crb_driver must successfully probe this device via
   tpm_crb_ffa_init().

3. The tpm_crb driver using CRB over FF-A can then
   be probed successfully. (See crb_acpi_add() and
   tpm_crb_ffa_init() for reference.)

Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
all registered with device_initcall, which means crb_acpi_driver_init() may
be invoked before ffa_init() and tpm_crb_ffa_init() are completed.

When this occurs, probing the TPM device is deferred.
However, the deferred probe can happen after the IMA subsystem
has already been initialized, since IMA initialization is performed
during late_initcall, and deferred_probe_initcall() is performed
at the same level.

A similar situation has been reported for TPM devices attached to the SPI
bus [0].

To resolve this, introduce the IMA_INIT_LATE_SYNC option, which initialises
IMA at late_initcall_sync so that IMA is initialised after the deferred
probe of the TPM device has completed.

When this option is enabled, modules that access files in the
initramfs through usermode helper calls such as request_module()
during initcall must not be built-in. Otherwise, IMA may miss
measuring those files [1].

Link: https://lore.kernel.org/all/aYXEepLhUouN5f99@earth.li/ [0]
Link: https://lore.kernel.org/all/2b3782398cc17ce9d355490a0c42ebce9120a9ae.camel@linux.ibm.com/ [1]
Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
---
 security/integrity/ima/Kconfig    | 10 ++++++++++
 security/integrity/ima/ima_main.c |  4 ++++
 2 files changed, 14 insertions(+)

diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
index 862fbee2b174..75f71401fba3 100644
--- a/security/integrity/ima/Kconfig
+++ b/security/integrity/ima/Kconfig
@@ -332,4 +332,14 @@ config IMA_KEXEC_EXTRA_MEMORY_KB
 	  If set to the default value of 0, an extra half page of memory for those
 	  additional measurements will be allocated.

+config IMA_INIT_LATE_SYNC
+	bool "Initialise IMA at late_initcall_sync"
+	default n
+	help
+	  This option initialises IMA at late_initcall_sync for platforms
+	  where TPM device probing is deferred.
+	  When this option is enabled, modules that access files in the
+	  initramfs through usermode helper calls such as request_module()
+	  during initcall must not be built-in. Otherwise, IMA may miss
+	  file measurements for them.
 endif
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 5cea53fc36df..1cfae4b83dc5 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -1337,5 +1337,9 @@ DEFINE_LSM(ima) = {
 	.order = LSM_ORDER_LAST,
 	.blobs = &ima_blob_sizes,
 	/* Start IMA after the TPM is available */
+#ifndef CONFIG_IMA_INIT_LATE_SYNC
 	.initcall_late = init_ima,
+#else
+	.initcall_late_sync = init_ima,
+#endif
 };
-- 
LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}


* [PATCH v5 1/4] security: lsm: allow LSMs to register for late_initcall_sync init
From: Yeoreum Yun @ 2026-06-01 14:27 UTC (permalink / raw)
  To: linux-security-module, linux-kernel, linux-integrity
  Cc: paul, zohar, roberto.sassu, noodles, jarkko, sudeep.holla,
	jmorris, serge, dmitry.kasatkin, eric.snowberg, jgg, Yeoreum Yun
In-Reply-To: <20260601142749.3379697-1-yeoreum.yun@arm.com>

There are situations where LSMs have dependencies that might mean they
want to be initialised later in the boot process, to ensure those
dependencies are available. In particular there are some TPM setups (Arm
FF-A devices, SPI attached TPMs) required by IMA which are not
guaranteed to be initialised for regular initcall_late.

Add an initcall_late_sync option that can be used in these situations.

Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
---
 include/linux/lsm_hooks.h |  2 ++
 security/lsm_init.c       | 13 +++++++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index b4f8cad53ddb..c4488c4a6d8a 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -167,6 +167,7 @@ enum lsm_order {
  * @initcall_fs: LSM callback for fs_initcall setup, optional
  * @initcall_device: LSM callback for device_initcall() setup, optional
  * @initcall_late: LSM callback for late_initcall() setup, optional
+ * @initcall_late_sync: LSM callback for late_initcall_sync() setup, optional
  */
 struct lsm_info {
 	const struct lsm_id *id;
@@ -182,6 +183,7 @@ struct lsm_info {
 	int (*initcall_fs)(void);
 	int (*initcall_device)(void);
 	int (*initcall_late)(void);
+	int (*initcall_late_sync)(void);
 };
 
 #define DEFINE_LSM(lsm)							\
diff --git a/security/lsm_init.c b/security/lsm_init.c
index 7c0fd17f1601..a1ad641811de 100644
--- a/security/lsm_init.c
+++ b/security/lsm_init.c
@@ -556,13 +556,22 @@ device_initcall(security_initcall_device);
  * security_initcall_late - Run the LSM late initcalls
  */
 static int __init security_initcall_late(void)
+{
+	return lsm_initcall(late);
+}
+late_initcall(security_initcall_late);
+
+/**
+ * security_initcall_late_sync - Run the LSM late initcalls sync
+ */
+static int __init security_initcall_late_sync(void)
 {
 	int rc;
 
-	rc = lsm_initcall(late);
+	rc = lsm_initcall(late_sync);
 	lsm_pr_dbg("all enabled LSMs fully activated\n");
 	call_blocking_lsm_notifier(LSM_STARTED_ALL, NULL);
 
 	return rc;
 }
-late_initcall(security_initcall_late);
+late_initcall_sync(security_initcall_late_sync);
-- 
LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}



* [PATCH v5 0/4] introduce IMA_INIT_LATE_SYNC option
From: Yeoreum Yun @ 2026-06-01 14:27 UTC (permalink / raw)
  To: linux-security-module, linux-kernel, linux-integrity
  Cc: paul, zohar, roberto.sassu, noodles, jarkko, sudeep.holla,
	jmorris, serge, dmitry.kasatkin, eric.snowberg, jgg, Yeoreum Yun

To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
the TPM driver must be built in and must be probed before the IMA subsystem
is initialized.

However, when the TPM device operates over the FF-A protocol using
the CRB interface, probing fails and returns -EPROBE_DEFER if
the tpm_crb_ffa device — an FF-A device that provides the communication
interface to the tpm_crb driver — has not yet been probed.

To ensure the TPM device operating over the FF-A protocol with
the CRB interface is probed before IMA initialization,
the following conditions must be met:

1. The corresponding ffa_device must be registered,
   which is done via ffa_init().

2. The tpm_crb_driver must successfully probe this device via
   tpm_crb_ffa_init().

3. The tpm_crb driver using CRB over FF-A can then
   be probed successfully. (See crb_acpi_add() and
   tpm_crb_ffa_init() for reference.)

Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
all registered with device_initcall, which means crb_acpi_driver_init() may
be invoked before ffa_init() and tpm_crb_ffa_init() are completed.

When this occurs, probing the TPM device is deferred.
However, the deferred probe can happen after the IMA subsystem
has already been initialized, since IMA initialization is performed
during late_initcall, and deferred_probe_initcall() is performed
at the same level.

A similar situation has been reported for TPM devices attached to the
SPI bus [0].

To resolve this, introduce the IMA_INIT_LATE_SYNC option, which initialises
IMA at late_initcall_sync so that IMA is initialised after the deferred
probe of the TPM device.

When this option is enabled, modules that access files in the
initramfs through usermode helper calls such as request_module()
during initcall must not be built-in. Otherwise, IMA may miss
measuring those files, since they are accessed before IMA is
initialised [1].
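
As a rough illustration of the ordering problem described above, here is a
small userspace model in C. It is not kernel code: the enum, the struct and
every function name are hypothetical stand-ins, and the call order inside
each level is an assumed worst case (the CRB probe runs before FF-A is
ready, and IMA runs before the deferred probe retry). The only facts taken
from this cover letter are which initcall level each step belongs to.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of three initcall levels; only their order matters. */
enum level { LVL_DEVICE, LVL_LATE, LVL_LATE_SYNC, LVL_COUNT };

struct boot_state {
	bool ffa_ready;		/* models ffa_init() + tpm_crb_ffa probe done */
	bool tpm_probed;	/* models a successful tpm_crb probe */
	bool tpm_deferred;	/* probe hit -EPROBE_DEFER and awaits a retry */
	bool ima_has_tpm;	/* what IMA observed when it initialised */
};

static void ffa_setup(struct boot_state *s)
{
	s->ffa_ready = true;
}

static void crb_probe(struct boot_state *s)
{
	if (s->ffa_ready) {
		s->tpm_probed = true;
		s->tpm_deferred = false;
	} else {
		s->tpm_deferred = true;	/* retried by the deferred probe */
	}
}

static void deferred_probe(struct boot_state *s)
{
	if (s->tpm_deferred)
		crb_probe(s);
}

static void ima_init(struct boot_state *s)
{
	s->ima_has_tpm = s->tpm_probed;
}

/* Run the modelled boot with IMA initialised at the given level. */
bool ima_sees_tpm(enum level ima_level)
{
	struct boot_state s = {0};

	assert(ima_level == LVL_LATE || ima_level == LVL_LATE_SYNC);

	for (int lvl = 0; lvl < LVL_COUNT; lvl++) {
		if (lvl == LVL_DEVICE) {
			crb_probe(&s);	/* assumed to run before FF-A is ready */
			ffa_setup(&s);
		}
		if (lvl == (int)ima_level)
			ima_init(&s);	/* assumed to run before the retry */
		if (lvl == LVL_LATE)
			deferred_probe(&s);
	}
	return s.ima_has_tpm;
}
```

In this model, ima_sees_tpm(LVL_LATE) is false and
ima_sees_tpm(LVL_LATE_SYNC) is true, which is the behaviour the new option
is meant to provide.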

Link: https://lore.kernel.org/all/aYXEepLhUouN5f99@earth.li/ [0]
Link: https://lore.kernel.org/all/2b3782398cc17ce9d355490a0c42ebce9120a9ae.camel@linux.ibm.com/ [1]

Patch history
=============
from v4 to v5:
  - rebase on v7.1-rc6
  - apply boot_aggregate name patch from @Jonathan and align it with
    IMA_INIT_LATE_SYNC option.
  - https://lore.kernel.org/all/20260525075404.3480282-1-yeoreum.yun@arm.com/

from v3 to v4:
  - rebase on v7.1-rc5
  - introduce IMA_INIT_LATE_SYNC option to control IMA initialisation.
  - https://lore.kernel.org/all/cover.1777036497.git.noodles@meta.com/

from v2 to v3:
  - Drop ff-a/pKVM diff (this seems to have a separate set of
    discussion)
  - Rework IMA delayed initialisation to avoid delaying when unnecessary
  - Ensure IMA log clearly indicates when we've initialised late
  - https://lore.kernel.org/all/20260422162449.1814615-1-yeoreum.yun@arm.com/

from v1 to v2:
  - add notifier to make ffa-driver pkvm initialised.
  - modify to try initialisation again when IMA couldn't find a proper TPM device.
  - https://lore.kernel.org/all/20260417175759.3191279-1-yeoreum.yun@arm.com/#t

Jonathan McDowell (1):
  security: ima: rename boot_aggregate when ima is initialised at
    late_sync

Yeoreum Yun (3):
  security: lsm: allow LSMs to register for late_initcall_sync init
  security: ima: introduce IMA_INIT_LATE_SYNC option
  tpm: tpm_crb_ffa: revert defered_probed when tpm_crb_ffa is built-in

 drivers/char/tpm/tpm_crb_ffa.c            | 18 +++---------------
 include/linux/lsm_hooks.h                 |  2 ++
 security/integrity/ima/Kconfig            | 10 ++++++++++
 security/integrity/ima/ima.h              |  1 +
 security/integrity/ima/ima_init.c         | 15 +++++++++++----
 security/integrity/ima/ima_main.c         |  4 ++++
 security/integrity/ima/ima_template_lib.c |  3 ++-
 security/lsm_init.c                       | 13 +++++++++++--
 8 files changed, 44 insertions(+), 22 deletions(-)

base-commit: e43ffb69e0438cddd72aaa30898b4dc446f664f8
-- 
LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}


* Re: [PATCH v5 12/13] ima: Return error on deleting measurements already copied during kexec
From: Mimi Zohar @ 2026-06-01 13:47 UTC (permalink / raw)
  To: Roberto Sassu, corbet, skhan, dmitry.kasatkin, eric.snowberg,
	paul, jmorris, serge
  Cc: linux-doc, linux-kernel, linux-integrity, linux-security-module,
	gregorylumen, chenste, nramas, Roberto Sassu
In-Reply-To: <8a0c965e1c2f3eee1006c4941206d70a71e7d0f0.camel@huaweicloud.com>

On Fri, 2026-05-29 at 16:59 +0200, Roberto Sassu wrote:
> On Tue, 2026-05-26 at 10:02 -0400, Mimi Zohar wrote:
> > On Wed, 2026-04-29 at 18:03 +0200, Roberto Sassu wrote:
> > > From: Roberto Sassu <roberto.sassu@huawei.com>
> > > 
> > > Refuse to delete staged or active list measurements, if a kexec racing with
> > > the deletion already copied those measurements in the kexec buffer. In this
> > > way, user space becomes aware that those measurements are going to appear
> > > in the secondary kernel, and thus they don't have to be saved twice.
> > 
> > There are two reboot notifiers: one to prevent additional measurements extending
> > the TPM, while the other copies the measurements for kexec.  This patch prevents
> > deleting the staged measurements after the latter notifier.
> > 
> > Instead of introducing a specific method for detecting whether the measurement
> > list has been copied, rely on one of the two existing reboot notifiers. The
> > simplest method would test "ima_measurements_suspended", which would prevent
> > deleting the staged measurements a bit earlier.
> 
> Testing that the reboot notifier fired (with the
> ima_measurements_suspended variable) is not enough to know whether the
> measurements dump took place or not.
> 
> We need a flag (one is enough) protected by ima_extend_list_mutex, so
> that we know reliably which event occurred first, either the dump or the
> staging/delete (which are also protected by ima_extend_list_mutex).

I'm suggesting not allowing the staged measurements, if there are any, to be
deleted once the reboot notifier has started. They'll be copied at the late
reboot notifier.
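
To make the idea concrete, a minimal userspace sketch in C follows. It is
not the IMA code: the struct, the function names and the -EBUSY return
value are all assumptions for illustration, and a pthread mutex stands in
for ima_extend_list_mutex. It only shows deletion being refused once a flag
set by the reboot notifier is observed under the same lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in, not the kernel's data structure. */
struct measurement_list {
	pthread_mutex_t lock;	/* models ima_extend_list_mutex */
	bool reboot_started;	/* models a flag set by the reboot notifier */
	int staged;		/* number of staged measurements */
};

/* Models the reboot notifier: record that the reboot has started. */
void notifier_fired(struct measurement_list *l)
{
	assert(l != NULL);
	pthread_mutex_lock(&l->lock);
	l->reboot_started = true;
	pthread_mutex_unlock(&l->lock);
}

/*
 * Delete the staged measurements unless the reboot has started.
 * Returns 0 on success or -EBUSY (an assumed error code) on refusal.
 */
int delete_staged(struct measurement_list *l)
{
	int ret = 0;

	assert(l != NULL);
	pthread_mutex_lock(&l->lock);
	if (l->reboot_started)
		ret = -EBUSY;	/* staged entries stay and get copied later */
	else
		l->staged = 0;
	pthread_mutex_unlock(&l->lock);
	return ret;
}
```

Because the flag is read and written under the same lock as the deletion,
the two events are always ordered with respect to each other.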

Mimi


* Re: [PATCH v4 3/3] tpm: tpm_crb_ffa: revert defered_probed when tpm_crb_ffa is built-in
From: Yeoreum Yun @ 2026-06-01 10:01 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Jarkko Sakkinen, linux-security-module, linux-kernel,
	linux-integrity, paul, zohar, roberto.sassu, noodles, jmorris,
	serge, dmitry.kasatkin, eric.snowberg, jgg
In-Reply-To: <20260601-shiny-steel-jellyfish-b38b6e@sudeepholla>

> On Mon, Jun 01, 2026 at 08:17:13AM +0100, Yeoreum Yun wrote:
> > Hi Jarkko,
> > 
> > Sorry for late answer.
> > 
> > > On Mon, May 25, 2026 at 08:54:04AM +0100, Yeoreum Yun wrote:
> > > > commit 746d9e9f62a6 ("tpm: tpm_crb_ffa: try to probe tpm_crb_ffa when it's build_in")
> > > > probe tpm_crb_ffa forcefully when it's built-in to integrate with IMA.
> > > > 
> > > > However, IMA now provides the IMA_INIT_LATE_SYNC build option, which
> > > > initialises IMA at the late_initcall_sync level, so this change is no
> > > > longer required.
> > > > 
> > > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > > > ---
> > > >  drivers/char/tpm/tpm_crb_ffa.c | 18 +++---------------
> > > >  1 file changed, 3 insertions(+), 15 deletions(-)
> > > > 
> > > > diff --git a/drivers/char/tpm/tpm_crb_ffa.c b/drivers/char/tpm/tpm_crb_ffa.c
> > > > index 99f1c1e5644b..025c4d4b17ca 100644
> > > > --- a/drivers/char/tpm/tpm_crb_ffa.c
> > > > +++ b/drivers/char/tpm/tpm_crb_ffa.c
> > > > @@ -177,23 +177,13 @@ static int tpm_crb_ffa_to_linux_errno(int errno)
> > > >   */
> > > >  int tpm_crb_ffa_init(void)
> > > >  {
> > > > -	int ret = 0;
> > > > -
> > > > -	if (!IS_MODULE(CONFIG_TCG_ARM_CRB_FFA)) {
> > > > -		ret = ffa_register(&tpm_crb_ffa_driver);
> > > > -		if (ret) {
> > > > -			tpm_crb_ffa = ERR_PTR(-ENODEV);
> > > > -			return ret;
> > > > -		}
> > > > -	}
> > > > -
> > > >  	if (!tpm_crb_ffa)
> > > > -		ret = -ENOENT;
> > > > +		return -ENOENT;
> > > >  
> > > >  	if (IS_ERR_VALUE(tpm_crb_ffa))
> > > > -		ret = -ENODEV;
> > > > +		return -ENODEV;
> > > >  
> > > > -	return ret;
> > > > +	return 0;
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(tpm_crb_ffa_init);
> > > >  
> > > > @@ -405,9 +395,7 @@ static struct ffa_driver tpm_crb_ffa_driver = {
> > > >  	.id_table = tpm_crb_ffa_device_id,
> > > >  };
> > > >  
> > > > -#ifdef MODULE
> > > >  module_ffa_driver(tpm_crb_ffa_driver);
> > > > -#endif
> > > >  
> > > >  MODULE_AUTHOR("Arm");
> > > >  MODULE_DESCRIPTION("TPM CRB FFA driver");
> > > > -- 
> > > > LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
> > > > 
> > > 
> > > How we would sync up this patch? Through which tree etc.
> > 
> > IMHO, the IMA-relevant changes should go into the IMA tree.
> > However, I think this patch would be much easier to sync into Sudeep's
> > FF-A tree, where FF-A initialisation is reverted to device_initcall,
> > unless you're uncomfortable with that.
> > 
> > For this, It might be better to split this patch from this series
> > since by above and defer probe of ff-a would make a register failure
> > of registering tpm_crb_ffa driver which is built-in.
> > 
> > @Sudeep what do you think?
> > 
> 
> IIRC, there is/was no dependency between these and FF-A patches that are
> queued in terms of build. I agree there may be dependency to get all the
> functionality but we can resort to linux-next for that. FF-A is not enabled
> in the defconfig, so anyone working on FF-A + TPM must enable then and can
> rely on -next IMHO.
> 
> That said, I have already sent PR for FF-A to SoC team and it is already
> queued for v7.2. I don't have any other plans unless they are fixes.

Thanks. Then I think it's enough to merge this patch into the TPM tree
once this patchset is approved.

-- 
Sincerely,
Yeoreum Yun


* Re: [PATCH v2 9/9] landlock: Add documentation for capability and namespace restrictions
From: Günther Noack @ 2026-06-01  9:37 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Christian Brauner, Günther Noack, Paul Moore,
	Serge E . Hallyn, Daniel Durning, Jonathan Corbet, Justin Suess,
	Lennart Poettering, Mikhail Ivanov, Nicolas Bouchinet,
	Shervin Oloumi, Tingmao Wang, kernel-team, linux-fsdevel,
	linux-kernel, linux-security-module, Alejandro Colomar
In-Reply-To: <20260527181127.879771-10-mic@digikod.net>

On Wed, May 27, 2026 at 08:11:22PM +0200, Mickaël Salaün wrote:
> Document the two new Landlock permission categories in the userspace API
> guide, admin guide, and kernel security documentation.
> 
> The userspace API guide adds sections on capability restriction
> (LANDLOCK_PERM_CAPABILITY_USE with LANDLOCK_RULE_CAPABILITY) and
> namespace restriction (LANDLOCK_PERM_NAMESPACE_USE with
> LANDLOCK_RULE_NAMESPACE, covering creation, entry, and fd-reference
> acquisition), the backward-compatible degradation pattern for ABI < 10,
> and the per-namespace-type capability requirements.
> 
> The admin guide adds the new perm.namespace_use and perm.capability_use
> audit blocker names with their object identification fields
> (namespace_type, namespace_id, capability).
> 
> The kernel security documentation adds a "Ruleset restriction models"
> section defining the three models (handled_access_*, handled_perm,
> scoped), their coverage and compatibility properties, and the criteria
> for choosing between them for future features.  It also documents
> composability with user namespaces and adds kernel-doc references for
> the new capability and namespace headers.
> 
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Günther Noack <gnoack@google.com>
> Cc: Paul Moore <paul@paul-moore.com>
> Cc: Serge E. Hallyn <serge@hallyn.com>
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> ---
> 
> Changes since v1:
> https://lore.kernel.org/r/20260312100444.2609563-12-mic@digikod.net
> 
> The userspace API and security guides were revamped to match the v2
> permission model: the previous chokepoints/gateways prose is replaced
> with the per-object (handled_access_*) versus per-category
> (handled_perm) framing, and a new Design philosophy section in the
> security guide states Landlock's principle (data, processes, kernel
> resources).
> 
> - Rename namespace_inum to namespace_id in audit field documentation
>   to match the renamed audit field.
> - Rename LANDLOCK_PERM_NAMESPACE_ENTER references to
>   LANDLOCK_PERM_NAMESPACE_USE (companion change to the introducing
>   commit), and enumerate the seven kernel paths it gates in the
>   userspace API guide (membership via unshare/clone/clone3/setns; fd
>   reference via open_tree/fsmount).
> - Clarify that LANDLOCK_PERM_NAMESPACE_USE gates *acquisition* of
>   namespace associations only (namespaces the process is already a
>   member of when the domain is enforced are implicitly allowed) and
>   that LANDLOCK_PERM_CAPABILITY_USE gates every exercise of a
>   capability after the domain is enforced, regardless of how the
>   capability was obtained.
> - Document the rationale for accepting (rather than rejecting)
>   unknown category member values in rule bodies: rejection would tie
>   Landlock policy semantics to the running kernel's category-member
>   set, making cross-kernel policies brittle.  Acceptance is fail-safe
>   in both directions and lets a policy activate as written when a
>   value becomes real on a future kernel.
> - Replace handled_perm = 0 with a per-bit mask in the userspace API
>   guide's ABI compat fall-through, so future ABI extensions adding
>   new LANDLOCK_PERM_* bits do not get stripped on the path that
>   drops the v10 bits.
> - Add a bridging sentence in the per-category permissions section
>   of Documentation/security/landlock.rst contrasting per-category
>   permissions with per-object access rights: per-category gates the
>   prerequisite operation itself rather than restricting specific
>   operations on a single resource instance (suggested by Günther
>   Noack).
> - Disambiguate the orthogonality invariant in
>   Documentation/security/landlock.rst from the UAPI scoped field
>   ("all new scoped features" -> "all Landlock access controls";
>   suggested by Justin Suess).
> - Add an introductory paragraph in
>   Documentation/userspace-api/landlock.rst contrasting
>   LANDLOCK_PERM_CAPABILITY_USE with PR_SET_NO_NEW_PRIVS: NNP is the
>   broader mechanism that blocks privilege acquisition via execve(2),
>   while CAPABILITY_USE restricts the exercise of capabilities the
>   process already holds (including those gained via CLONE_NEWUSER,
>   which NNP does not block); sandboxes typically set both
>   (suggested by Justin Suess).
> - Disambiguate "category": object-side uses "object type" / "resource
>   kind"; "category" stays for the per-category permissions model.
> ---
>  Documentation/admin-guide/LSM/landlock.rst |  19 +-
>  Documentation/security/landlock.rst        | 151 +++++++++++++-
>  Documentation/userspace-api/landlock.rst   | 216 +++++++++++++++++++--
>  3 files changed, 367 insertions(+), 19 deletions(-)
> 
> diff --git a/Documentation/admin-guide/LSM/landlock.rst b/Documentation/admin-guide/LSM/landlock.rst
> index 9923874e2156..58ac5ae2f5f3 100644
> --- a/Documentation/admin-guide/LSM/landlock.rst
> +++ b/Documentation/admin-guide/LSM/landlock.rst
> @@ -6,7 +6,7 @@ Landlock: system-wide management
>  ================================
>  
>  :Author: Mickaël Salaün
> -:Date: January 2026
> +:Date: May 2026
>  
>  Landlock can leverage the audit framework to log events.
>  
> @@ -59,14 +59,25 @@ AUDIT_LANDLOCK_ACCESS
>          - scope.abstract_unix_socket - Abstract UNIX socket connection denied
>          - scope.signal - Signal sending denied
>  
> +    **perm.*** - Permission restrictions (ABI 10+):
> +        - perm.namespace_use - Namespace entry was denied (creation via
> +          :manpage:`unshare(2)` / :manpage:`clone(2)` or joining via
> +          :manpage:`setns(2)`);
> +          ``namespace_type`` indicates the type (hex CLONE_NEW* bitmask),
> +          ``namespace_id`` identifies the target namespace for
> +          :manpage:`setns(2)` operations
> +        - perm.capability_use - Capability use was denied;
> +          ``capability`` indicates the capability number
> +
>      Multiple blockers can appear in a single event (comma-separated) when
>      multiple access rights are missing. For example, creating a regular file
>      in a directory that lacks both ``make_reg`` and ``refer`` rights would show
>      ``blockers=fs.make_reg,fs.refer``.
>  
> -    The object identification fields (path, dev, ino for filesystem; opid,
> -    ocomm for signals) depend on the type of access being blocked and provide
> -    context about what resource was involved in the denial.
> +    The object identification fields depend on the type of access being blocked:
> +    ``path``, ``dev``, ``ino`` for filesystem; ``opid``, ``ocomm`` for signals;
> +    ``namespace_type`` and ``namespace_id`` for namespace operations;
> +    ``capability`` for capability use.
>  
>  
>  AUDIT_LANDLOCK_DOMAIN
> diff --git a/Documentation/security/landlock.rst b/Documentation/security/landlock.rst
> index c5186526e76f..2b6e4be42893 100644
> --- a/Documentation/security/landlock.rst
> +++ b/Documentation/security/landlock.rst
> @@ -7,7 +7,7 @@ Landlock LSM: kernel documentation
>  ==================================
>  
>  :Author: Mickaël Salaün
> -:Date: March 2026
> +:Date: May 2026
>  
>  Landlock's goal is to create scoped access-control (i.e. sandboxing).  To
>  harden a whole system, this feature should be available to any process,
> @@ -129,6 +129,143 @@ The reasoning is:
>    restrictions, because access within the same scope is already
>    allowed based on ``LANDLOCK_ACCESS_FS_RESOLVE_UNIX``.
>  
> +Composability with user namespaces
> +----------------------------------
> +
> +Landlock domain-based scoping and the kernel's user namespace-based capability
> +scoping enforce isolation over independent hierarchies.

Minor grammatical nit: "user namespace-based" is a bit hard to read
because it reads like (user) (namespace-based), where it should be
reading as (user namespace)-(based).

In my understanding after digging around, I believe the recommended
approach is to use "user-namespace-based", or em-dashes, or simply
rephrase it ("the kernel's capability scoping based on user
namespaces").

Reference (6th question):
https://www.chicagomanualofstyle.org/qanda/data/faq/topics/HyphensEnDashesEmDashes.html#:~:text=But%20%E2%80%9Ctime%20clock%E2%80%9D%20is%20an%20open%20compound%2C%20so%20this%20seems%20contradictory


> +Landlock checks domain
> +ancestry; the kernel's ``ns_capable()`` checks user namespace ancestry.  These
> +hierarchies are orthogonal: Landlock enforcement is deterministic with respect
> +to its own configuration, regardless of namespace or capability state, and vice
> +versa.  This orthogonality is a design invariant that must hold for all Landlock
> +access controls.
> +
> +Design philosophy
> +-----------------
> +
> +Landlock's goal is to restrict a sandboxed process's access to three kinds of
> +resources: data (files, sockets, pipes), other processes (signals, ptrace), and
> +kernel-internal resources whose use widens the kernel attack surface
> +(capabilities, namespace types).  Each access right or permission gates one or
> +more operations that grant such access; restricting the operations is how
> +Landlock restricts the underlying access.
> +
> +When designing a new access control, identify the protected resource kind
> +first (data, processes, or kernel-internal resources).  The operation set
> +follows from the protected resource: which kernel paths grant access to it, and
> +at which moment those paths can be gated.

Minor grammatical suggestion (a bit more verbose but maybe clearer):

  The operations to restrict follow from the protected resource,
  by identifying which kernel code paths grant access to the resource
  and at which place in the code the access to the resource can be gated.


> +Do not design a permission around
> +"restrict the unshare(2) syscall" or similar mechanism-centric framings; design
> +it around "restrict the process from acquiring access to namespace types" (the
> +protected resource), letting the operation set follow.

I like the rewritten "design philosophy" section, this is much clearer
than in V1. :)


> +Ruleset restriction models
> +--------------------------
> +
> +Landlock provides three restriction models that differ in how rules identify the
> +resource being restricted.

Maybe add two paragraphs here to explain the commonalities as well,
e.g.

  In general, the ``struct landlock_ruleset_attr`` specifies the
  operations to be denied by default under the enforced policy.

  The *rules* added to the ruleset define the exceptions to these
  restrictions, allow-listing specific conditions under which these
  operations are still permitted.


> +Per-object access rights (``handled_access_*``)
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Per-object access rights control operations on a specific resource instance,
> +identified in the rule key by a value drawn from an open-ended space: a file
> +hierarchy referenced by ``parent_fd``, or a network port identified by its
> +16-bit number.

(New paragraph here?)

> + Each ``handled_access_*`` field declares a set of access rights
> +that the ruleset restricts.

Minor suggestion:

  Each ``handled_access_*`` field declares a set of access rights,
  operations which are to be denied by default once the ruleset is enforced.

(New paragraph here?)

> +The rule body declares which of the multiple
> +distinct operations on that object instance are allowed (open, read, write,
> +truncate; bind, connect).

> +New operations on an existing rule type extend the
> +corresponding ``handled_access_*`` field (e.g. a new filesystem operation
> +extends ``handled_access_fs``).  A new object type with multiple fine-grained
> +operations would use a new ``handled_access_*`` field.

Suggestion:

  Operations are grouped by object type in the respective
  ``handled_access_*`` field.  When a future version of Landlock
  introduces a new operation for an existing object type, it is added
  to the existing ``handled_access_*`` field for that object type.
  When Landlock adds a new object type, a new ``handled_access_*``
  field for that object type is added.

> +
> +Per-category permissions (``handled_perm``)
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Per-category permissions control the process's exercise of category members,
> +where the category is a small kernel-defined enumeration (a Linux capability
> +number ``CAP_*``, a namespace type ``CLONE_NEW*``).  Unlike per-object access
> +rights, which restrict specific operations on a single resource instance,
> +per-category permissions gate the prerequisite operation itself (exercising a
> +capability, acquiring a namespace), so gating it transitively covers a broad set
               ^^^^^^^^^
               "entering"?

> +of downstream operations.

(New paragraph here?)


> +These category members are the LSM-level
> +access-control objects (the entities the process is authorized against) even
> +though they are enum values rather than externally-instantiated kernel data
> +structures.  Per-category permissions apply where the controlled operation
> +collapses to "may the process use this category member at all" (use a
> +capability; acquire a namespace), so the rule body lists which category members
> +the process may exercise; each ``LANDLOCK_PERM_*`` flag maps to its own rule
> +type and covers every kernel path that exercises a member.  When a ruleset
> +handles a permission, all uses of category members are denied unless explicitly
> +allowed by a rule.

Nit: It feels that "Each LANDLOCK_PERM_* flag maps to its own rule
type" is one of the most important sentences here, and I'd maybe move
that at the beginning of a paragraph to make it a bit more prominent.

(New paragraph here?)

> +See Documentation/userspace-api/landlock.rst for the
> +concrete syscall paths covered by each permission.

> +
> +The category enum is owned by the corresponding kernel subsystem (capabilities,
> +namespaces, etc.).  Userspace policy authors query category member availability
> +via the relevant non-Landlock interfaces:
> +
> +* For capabilities: ``<linux/capability.h>``,
> +  ``/proc/sys/kernel/cap_last_cap``, ``prctl(PR_CAPBSET_READ)``.
> +* For namespaces: ``<linux/sched.h>``, ``/proc/$$/ns/*``,
> +  :manpage:`unshare(2)` runtime probe.
> +
> +The Landlock ABI version does not encode this availability; ABI versioning
> +describes which Landlock features (rule types, access rights, scopes,
> +permissions) the kernel implements, not which category members the kernel knows
> +about.
> +
> +Forward compatibility for new category members follows a simple rule set:
> +
> +* New members in future kernels are automatically denied: rules whitelist
> +  specific values, and a member not in any rule is denied.
> +* Kernel-side compatibility for split categories is handled by the owning
> +  subsystem (e.g., when ``CAP_BPF`` was split from ``CAP_SYS_ADMIN``, the
> +  kernel kept checking either capability, so a rule denying ``CAP_SYS_ADMIN``
> +  continues to deny operations gated by ``CAP_SYS_ADMIN || CAP_BPF`` patterns).

This is not clear to me; a rule is not denying anything, because rules
only allow things.  Did you mean to write "a rule allowing
CAP_SYS_ADMIN continues to allow operations gated by "CAP_SYS_ADMIN ||
CAP_BPF"?

After CAP_BPF was split off of CAP_SYS_ADMIN, either one of these two
capabilities is now sufficient for the operation guarded by it.
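
To illustrate the reading I have in mind, here is a small userspace model
in C. It is only a sketch: the ex_* names and the struct are hypothetical,
the process is assumed to hold both capabilities, and the numeric values
21 and 39 are the ones <linux/capability.h> assigns to CAP_SYS_ADMIN and
CAP_BPF. Rules only ever add entries to the allow-list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values of CAP_SYS_ADMIN and CAP_BPF in <linux/capability.h>. */
#define EX_CAP_SYS_ADMIN	21
#define EX_CAP_BPF		39

/* Hypothetical model of a domain; not the kernel's representation. */
struct ex_domain {
	bool handles_capability_use;	/* models the handled_perm bit */
	uint64_t allowed_caps;		/* bit N set: a rule allows cap N */
};

/* Deny by default once handled; rules can only allow. */
bool ex_cap_use_allowed(const struct ex_domain *d, unsigned int cap)
{
	assert(cap < 64);
	if (!d->handles_capability_use)
		return true;
	return (d->allowed_caps >> cap) & 1;
}

/* Models an operation gated by "CAP_BPF || CAP_SYS_ADMIN". */
bool ex_bpf_op_allowed(const struct ex_domain *d)
{
	return ex_cap_use_allowed(d, EX_CAP_BPF) ||
	       ex_cap_use_allowed(d, EX_CAP_SYS_ADMIN);
}
```

With no rules the operation is denied; a rule allowing either capability
is enough to allow it again, so an older policy that only lists
CAP_SYS_ADMIN keeps working.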

> +* Unknown values in the rule body are silently accepted rather than rejected.
> +  Rejecting them would tie Landlock policy semantics to the running kernel's
> +  category-member set: a rule built against future headers would fail to load
> +  on older kernels, forcing policy authors to know each kernel's enumeration.
> +  Acceptance is fail-safe in both directions: a rule referring to a value the
> +  running kernel does not yet know has no effect (deny-by-default still applies
> +  to that operation), and a rule written against future headers loads
> +  identically across kernels so the same policy keeps the same restrictions.
> +  When a value becomes real on a future kernel, the policy activates as written
> +  by the author.
> +* In contrast, unknown ``LANDLOCK_PERM_*`` flags in ``handled_perm`` are
> +  rejected (``-EINVAL``), since Landlock owns that bit space.
> +
> +Cross-domain scopes (``scoped``)
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Scopes restrict **cross-domain interactions** categorically, without rules.
> +Setting a scope flag (e.g.  ``LANDLOCK_SCOPE_SIGNAL``) denies the operation to
> +targets outside the Landlock domain or its children.  Like per-category
> +permissions, scopes provide complete coverage of the controlled operation.
> +
> +Choosing a model for a new feature
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +* If the new feature controls operations on resource objects supplied by the
> +  sandbox author, extend or add a per-object access right
> +  (``handled_access_*``).
> +* If the new feature controls a per-category operation gated by an enum (a
> +  Linux capability, a namespace type, a socket family, etc.), use a
> +  per-category permission (``handled_perm``).  When several such enums could
> +  classify the operation, prefer the enum the originating subsystem already
> +  uses for capability/access checks (e.g. ``CAP_*`` for ``capable()`` hooks,
> +  ``CLONE_NEW*`` for namespace hooks).
> +* When an operation is gated by multiple kernel-defined enums (a classic
> +  example being ``CAP_SYS_ADMIN`` plus a ``CLONE_NEW*`` flag for non-user
> +  namespace creation), define one per-category permission per enum dimension.
> +  Sandbox authors handle each dimension's permission in ``handled_perm`` and
> +  add rules for each; the kernel enforces each dimension at its own LSM hook.
> +  ``LANDLOCK_PERM_NAMESPACE_USE`` and ``LANDLOCK_PERM_CAPABILITY_USE`` follow
> +  this pattern.
> +* If the new feature restricts a categorical cross-domain interaction with no
> +  per-target granularity, use a cross-domain scope (``scoped``).
> +* For all three models, confirm a single LSM hook (or small set of related
> +  hooks) covers every kernel path that exercises the operation.
> +
>  Tests
>  =====
>  
> @@ -150,6 +287,18 @@ Filesystem
>  .. kernel-doc:: security/landlock/fs.h
>      :identifiers:
>  
> +Namespace
> +---------
> +
> +.. kernel-doc:: security/landlock/ns.h
> +    :identifiers:
> +
> +Capability
> +----------
> +
> +.. kernel-doc:: security/landlock/cap.h
> +    :identifiers:
> +
>  Process credential
>  ------------------
>  
> diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
> index 45861fa75685..45548d1666fa 100644
> --- a/Documentation/userspace-api/landlock.rst
> +++ b/Documentation/userspace-api/landlock.rst
> @@ -29,20 +29,29 @@ If Landlock is not currently supported, we need to
>  Landlock rules
>  ==============
>  
> -A Landlock rule describes an action on an object which the process intends to
> -perform.  A set of rules is aggregated in a ruleset, which can then restrict
> -the thread enforcing it, and its future children.
> +A Landlock rule describes the actions a process is allowed to perform on a
> +specific resource.  A set of rules is aggregated in a ruleset, which can then
> +restrict the thread enforcing it, and its future children.
>  
> -The two existing types of rules are:
> +The existing types of rules are:
>  
>  Filesystem rules
> -    For these rules, the object is a file hierarchy,
> -    and the related filesystem actions are defined with
> -    `filesystem access rights`.
> +    The rule key is a file hierarchy, and the actions it allows are
> +    defined with `filesystem access rights`.
>  
>  Network rules (since ABI v4)
> -    For these rules, the object is a TCP port,
> -    and the related actions are defined with `network access rights`.
> +    The rule key is a TCP port, and the actions it allows are defined with
> +    `network access rights`.
> +
> +Capability rules (since ABI v10)
> +    The rule body lists which members of the Linux capability category
> +    the process may exercise; the action is defined with `permission
> +    flags`.

Suggestion:

  The rule body lists which Linux capabilities the process may
  exercise; ...

(The notion of "category" was introduced in the design rationale,
and would probably confuse me if I hadn't read that first.)

> +
> +Namespace rules (since ABI v10)
> +    The rule body lists which members of the namespace-type
> +    category the process may use; the action is defined with `permission
> +    flags`.

Similar here:

  The rule body lists which namespace types the process may use; ...

Should it say "...the process may *enter*" instead?  I noticed that
you renamed the LANDLOCK_PERM_NAMESPACE_USE enum, but it's still about
*entering* these namespaces, right?  In a sense, a process is *using*
each of these namespace types also during normal user lookup, file
lookup etc, and that is all not restricted here.


>  Defining and enforcing a security policy
>  ----------------------------------------
> @@ -85,6 +94,9 @@ to be explicit about the denied-by-default access rights.
>          .scoped =
>              LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
>              LANDLOCK_SCOPE_SIGNAL,
> +        .handled_perm =
> +            LANDLOCK_PERM_CAPABILITY_USE |
> +            LANDLOCK_PERM_NAMESPACE_USE,
>      };
>  
>  Because we may not know which kernel version an application will be executed
> @@ -132,6 +144,11 @@ version, and only use the available subset of access rights:
>      case 6 ... 8:
>          /* Removes LANDLOCK_ACCESS_FS_RESOLVE_UNIX for ABI < 9 */
>          ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_RESOLVE_UNIX;
> +        __attribute__((fallthrough));
> +    case 9:
> +        /* Removes LANDLOCK_PERM_* for ABI < 10 */
> +        ruleset_attr.handled_perm &= ~(LANDLOCK_PERM_NAMESPACE_USE |
> +                                       LANDLOCK_PERM_CAPABILITY_USE);
>      }
>  
>  This enables the creation of an inclusive ruleset that will contain our rules.
> @@ -202,6 +219,53 @@ number for a specific action: HTTPS connections.
>          err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
>                                  &net_port, 0);
>  
> +Capability and namespace rules use a different attribute layout:
> +``allowed_perm`` identifies the permission category (a single
> +``LANDLOCK_PERM_*`` flag) and a type-specific value field carries the bitmask to
> +allow within it.  See `Capability and namespace restrictions`_ for the model.
> +
> +For capability access-control, we can add rules that allow specific
> +capabilities.  For instance, to allow ``CAP_SYS_CHROOT`` (so the sandboxed
> +process can call :manpage:`chroot(2)` inside a user namespace):
> +
> +.. code-block:: c
> +
> +    struct landlock_capability_attr cap_attr = {
> +        .allowed_perm = LANDLOCK_PERM_CAPABILITY_USE,
> +        .capabilities = (1ULL << CAP_SYS_CHROOT),
> +    };
> +
> +    cap_attr.allowed_perm &= ruleset_attr.handled_perm;
> +    if (cap_attr.allowed_perm)
> +        err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_CAPABILITY,
> +                                &cap_attr, 0);

I would suggest cross-referencing the capabilities(7) man page in
this section, which lists the available CAP_* values.
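
To illustrate how such a bitmask is composed from the values listed in
capabilities(7), here is a rough sketch.  It rests on my own assumptions:
cap_bit() and cap_mask_allows() are hypothetical helpers, not part of this
patch or of any Landlock API, and the sketch assumes the field is a 64-bit
mask indexed by CAP_* number, as the `1ULL << CAP_SYS_CHROOT` line above
suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <linux/capability.h>

/* Hypothetical helper: the bit for one CAP_* value; cap must be below 64. */
static inline uint64_t cap_bit(unsigned int cap)
{
	return UINT64_C(1) << cap;
}

/* Hypothetical helper: true if the mask contains the given capability. */
static inline bool cap_mask_allows(uint64_t mask, unsigned int cap)
{
	/* Reject out-of-range values before shifting to avoid undefined behavior. */
	return cap < 64 && (mask & cap_bit(cap)) != 0;
}

/* Small self-check: allow two capabilities and confirm a third is absent. */
void cap_mask_self_check(void)
{
	uint64_t mask = cap_bit(CAP_SYS_CHROOT) | cap_bit(CAP_NET_BIND_SERVICE);

	assert(cap_mask_allows(mask, CAP_SYS_CHROOT));
	assert(cap_mask_allows(mask, CAP_NET_BIND_SERVICE));
	assert(!cap_mask_allows(mask, CAP_SYS_ADMIN));
}
```

A mask built this way would presumably be what goes into the
`.capabilities` field of the rule attribute quoted above.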

> +
> +For namespace access-control, we can add rules that allow entering specific
> +namespace types (creating them via :manpage:`unshare(2)` / :manpage:`clone(2)` /
> +:manpage:`clone3(2)`, joining them via :manpage:`setns(2)`, or acquiring an fd
> +reference via :manpage:`open_tree(2)` / :manpage:`fsmount(2)`).  For instance,
> +to allow creating user namespaces (which grants all capabilities inside the new
> +namespace):
> +
> +.. code-block:: c
> +
> +    struct landlock_namespace_attr ns_attr = {
> +        .allowed_perm = LANDLOCK_PERM_NAMESPACE_USE,
> +        .namespace_types = CLONE_NEWUSER,
> +    };
> +
> +    ns_attr.allowed_perm &= ruleset_attr.handled_perm;
> +    if (ns_attr.allowed_perm)
> +        err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NAMESPACE,
> +                                &ns_attr, 0);

Likewise cross-reference namespaces(7) in this section, as a reference
for the available CLONE_* enum values?
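
For what it's worth, this is how I picture the relationship between the
allowed `namespace_types` mask and the flags a process later passes to
unshare(2) or clone(2).  It is only a userspace sketch under my own
assumptions: ns_request_allowed() and NS_TYPES_MASK are hypothetical names,
and the kernel-side enforcement in this series may well be structured
differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <linux/sched.h>

/* Namespace-creating clone flags, as listed in namespaces(7). */
#ifdef CLONE_NEWTIME
#define NS_TIME_FLAG CLONE_NEWTIME
#else
#define NS_TIME_FLAG 0
#endif
#define NS_TYPES_MASK                                              \
	((uint64_t)(CLONE_NEWNS | CLONE_NEWCGROUP | CLONE_NEWUTS | \
		    CLONE_NEWIPC | CLONE_NEWUSER | CLONE_NEWPID |  \
		    CLONE_NEWNET | NS_TIME_FLAG))

/*
 * Hypothetical helper: true if every namespace type requested in
 * clone_flags is present in the allowed mask.  Non-namespace flags
 * (e.g. CLONE_VM) are ignored.
 */
static inline bool ns_request_allowed(uint64_t allowed, uint64_t clone_flags)
{
	uint64_t requested = clone_flags & NS_TYPES_MASK;

	return (requested & ~allowed) == 0;
}

/* Small self-check mirroring the CLONE_NEWUSER-only rule quoted above. */
void ns_request_self_check(void)
{
	uint64_t allowed = CLONE_NEWUSER;

	assert(ns_request_allowed(allowed, CLONE_NEWUSER));
	assert(!ns_request_allowed(allowed, CLONE_NEWUSER | CLONE_NEWNET));
	assert(ns_request_allowed(allowed, CLONE_VM));
}
```

With only CLONE_NEWUSER allowed, a combined request for a user and a network
namespace fails the subset check, which matches the "denying all other
namespace types" wording further down.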


> +Together, these two rules allow an unprivileged process to create a user
> +namespace and call :manpage:`chroot(2)` inside it, while denying all other
> +capabilities and namespace types.  User namespace creation is the one operation
> +that does not require ``CAP_SYS_ADMIN``, so no capability rule is needed for it.
> +See `Capability and namespace restrictions`_ for details on capability
> +requirements.
> +
>  When passing a non-zero ``flags`` argument to ``landlock_restrict_self()``, a
>  similar backwards compatibility check is needed for the restrict flags
>  (see sys_landlock_restrict_self() documentation for available flags):
> @@ -380,9 +444,115 @@ The operations which can be scoped are:
>      A :manpage:`sendto(2)` on a socket which was previously connected will not
>      be restricted.  This works for both datagram and stream sockets.
>  
> -IPC scoping does not support exceptions via :manpage:`landlock_add_rule(2)`.
> -If an operation is scoped within a domain, no rules can be added to allow access
> -to resources or processes outside of the scope.
> +Scoping does not support exceptions via :manpage:`landlock_add_rule(2)`.  If an
> +operation is scoped within a domain, no rules can be added to allow access to
> +resources or processes outside of the scope.
> +
> +Capability and namespace restrictions
> +-------------------------------------
> +
> +``handled_perm`` declares per-category permissions: each permission selects
> +which members of a kernel-defined category (CAP_* capabilities, CLONE_NEW*
> +namespace types) the process may use.  Unlike per-object access rights
> +(``handled_access_*``) or cross-domain scopes (``scoped``), per-category
> +permissions constrain the sandboxed process's own use of these enums; members
> +not allowed by a rule are denied by default.
> +
> +``LANDLOCK_PERM_NAMESPACE_USE`` gates *acquisition* of namespace
> +associations:

"*acquisition of access* to namespaces"?

In my understanding, it is not just "entering", which would make the
namespace ambiently available to a process, but also the implicit
acquisition of a new namespace, as happens under the hood for
open_tree(2)?

> +creation via :manpage:`unshare(2)` / :manpage:`clone(2)`
> +/ :manpage:`clone3(2)`, entry via :manpage:`setns(2)`, and fd-reference
> +acquisition via :manpage:`open_tree(2)` / :manpage:`fsmount(2)`.  Namespaces
> +the process is already a member of when the domain is enforced are implicitly
> +allowed (the process could not continue running otherwise); rules describe which
> +new namespace types the process may acquire.  ``LANDLOCK_PERM_CAPABILITY_USE``
> +gates every exercise of a capability after the domain is enforced, regardless
> +of how the capability was obtained (inherited credentials, ``CLONE_NEWUSER``
> +grant, ``setuid``/file-cap-bearing :manpage:`execve(2)`, etc.).  Configuring
> +both together restricts what privileges are available *and* the namespaces in
> +which they take effect, which matters because user namespace creation has no
> +capability check and grants all capabilities within the new namespace: gating
> +only one of the two leaves a kernel attack-surface widening path open.
> +
> +``LANDLOCK_PERM_CAPABILITY_USE`` complements :manpage:`prctl(2)`
> +``PR_SET_NO_NEW_PRIVS`` but does not replace it.  ``PR_SET_NO_NEW_PRIVS``
> +prevents privilege *acquisition* via :manpage:`execve(2)` (setuid, file
> +capability xattrs, privilege-elevating LSM transitions) and is a prerequisite
> +for unprivileged Landlock self-sandboxing.  ``LANDLOCK_PERM_CAPABILITY_USE``
> +restricts *exercise* of capabilities the process already holds, including those
> +gained via ``CLONE_NEWUSER`` which ``PR_SET_NO_NEW_PRIVS`` does not block.
> +Sandboxes typically set both.
> +
> +Rules are added with ``LANDLOCK_RULE_CAPABILITY`` and &struct
> +landlock_capability_attr (each rule lists ``CAP_*`` values to allow), and with
> +``LANDLOCK_RULE_NAMESPACE`` and &struct landlock_namespace_attr (each rule
> +lists ``CLONE_NEW*`` flags to allow).  Landlock is purely restrictive: it can
> +only deny what the traditional check would have allowed, never grant additional
> +privileges.
> +
> +Rule bodies silently accept values unknown to the current kernel (capabilities
> +above ``CAP_LAST_CAP``, unrecognised ``CLONE_NEW*`` bits): they have no runtime
> +effect, so a rule compiled against future kernel headers loads without error on
> +older kernels.  Future kernels gain new members denied by default until a rule
> +explicitly allows them.
> +
> +The single ``LANDLOCK_PERM_NAMESPACE_USE`` bit gates every kernel path that
> +grants the calling process access to a namespace of the controlled types,
> +whether by becoming a member of the namespace or by holding a file descriptor
> +that references it.  The covered syscall paths are:
> +
> +* :manpage:`unshare(2)` with ``CLONE_NEW*``: the caller becomes a member of a
> +  newly-created namespace.
> +* :manpage:`clone(2)` (or :manpage:`clone3(2)`) with ``CLONE_NEW*``: the
> +  child becomes a member of a newly-created namespace.
> +* :manpage:`setns(2)`: the caller becomes a member of an existing namespace
> +  referenced by file descriptor.
> +* :manpage:`open_tree(2)` with ``OPEN_TREE_NAMESPACE``: the caller obtains a
> +  file descriptor referring to a newly-created mount namespace.

(OPEN_TREE_NAMESPACE is not documented in the man page so far.
Friendly nudge, Christian. :-))

> +* :manpage:`open_tree(2)` with ``OPEN_TREE_CLONE``: the caller obtains a file
> +  descriptor referring to a newly-created anonymous mount namespace.
> +* :manpage:`fsmount(2)` with ``FSMOUNT_NAMESPACE``: the caller obtains a file
> +  descriptor referring to a newly-created mount namespace.

(Ditto, it's not in the manpage; it's only getting introduced in 7.1,
so I hope it will eventually still end up there.)


> +* :manpage:`fsmount(2)` (default): the caller obtains a file descriptor
> +  referring to a newly-created anonymous mount namespace.
> +
> +Anonymous mount namespaces (created by ``open_tree(OPEN_TREE_CLONE)`` and the
> +default :manpage:`fsmount(2)`) are intentionally covered by the bit even though
> +the calling process does not become a member of them.  Without this coverage, a
> +sandboxed process could combine ``open_tree(OPEN_TREE_CLONE)`` with
> +:manpage:`move_mount(2)` to graft mounts from a freshly-allocated mount
> +namespace into its current namespace, bypassing the policy.
> +
> +In practice, unprivileged processes first create a user namespace (which
> +requires no capability and grants all capabilities within it), then use those
> +capabilities to create other namespace types.  All non-user namespace types
> +require ``CAP_SYS_ADMIN`` for both creation and :manpage:`setns(2)` entry; mount
> +namespace entry additionally requires ``CAP_SYS_CHROOT``.  For
> +:manpage:`setns(2)`, capabilities are checked relative to the target namespace,
> +so a process in an ancestor user namespace naturally satisfies them; this
> +includes joining user namespaces, which requires ``CAP_SYS_ADMIN``.  When
> +``LANDLOCK_PERM_CAPABILITY_USE`` is also handled, each of these capabilities
> +must be explicitly allowed by a rule.
> +
> +When combining ``CLONE_NEWUSER`` with other ``CLONE_NEW*`` flags in a single
> +:manpage:`unshare(2)` call, the ``CAP_SYS_ADMIN`` check targets the newly
> +created user namespace, which is handled by ``LANDLOCK_PERM_NAMESPACE_USE``
> +independently from ``LANDLOCK_PERM_CAPABILITY_USE``.  Performing the user
> +namespace creation and the additional namespace creation in two separate
> +:manpage:`unshare(2)` calls requires a rule allowing ``CAP_SYS_ADMIN`` if the
> +domain also handles ``LANDLOCK_PERM_CAPABILITY_USE``.
> +
> +When creating child user namespaces, it is recommended to also create a
> +dedicated Landlock domain with restrictions relevant to each namespace context.
> +
> +Note that ``LANDLOCK_PERM_CAPABILITY_USE`` restricts the *use* of capabilities,
> +not their presence in the process's credential.  Capability sets can change
> +after a domain is enforced through user namespace entry or :manpage:`capset(2)`;
> +privileged sandboxes that did not set ``PR_SET_NO_NEW_PRIVS`` may also gain
> +capabilities through :manpage:`execve(2)` of binaries with file capabilities.
> +In all cases, :manpage:`capget(2)` will report the credential's capability sets,
> +but any denied capability will fail with ``EPERM`` when exercised.  Do not rely
> +on :manpage:`capget(2)` to determine whether the policy permits a given
> +capability; only the actual operation will return ``EPERM`` upon denial.
>  
>  Truncating files
>  ----------------
> @@ -545,7 +715,7 @@ Access rights
>  -------------
>  
>  .. kernel-doc:: include/uapi/linux/landlock.h
> -    :identifiers: fs_access net_access scope
> +    :identifiers: fs_access net_access scope perm
>  
>  Creating a new ruleset
>  ----------------------
> @@ -564,7 +734,8 @@ Extending a ruleset
>  
>  .. kernel-doc:: include/uapi/linux/landlock.h
>      :identifiers: landlock_rule_type landlock_path_beneath_attr
> -                  landlock_net_port_attr
> +                  landlock_net_port_attr landlock_capability_attr
> +                  landlock_namespace_attr
>  
>  Enforcing a ruleset
>  -------------------
> @@ -722,6 +893,23 @@ Starting with the Landlock ABI version 9, it is possible to restrict
>  connections to pathname UNIX domain sockets (:manpage:`unix(7)`) using
>  the new ``LANDLOCK_ACCESS_FS_RESOLVE_UNIX`` right.
>  
> +Capability restriction (ABI < 10)
> +---------------------------------
> +
> +Starting with the Landlock ABI version 10, it is possible to restrict
> +:manpage:`capabilities(7)` with the new ``LANDLOCK_PERM_CAPABILITY_USE``
> +permission flag and ``LANDLOCK_RULE_CAPABILITY`` rule type.
> +
> +Namespace restriction (ABI < 10)
> +--------------------------------
> +
> +Starting with the Landlock ABI version 10, it is possible to restrict namespace
> +use across creation (:manpage:`unshare(2)`, :manpage:`clone(2)`,
> +:manpage:`clone3(2)`), entry (:manpage:`setns(2)`), and fd-reference acquisition
> +(:manpage:`open_tree(2)`, :manpage:`fsmount(2)`) with the new
> +``LANDLOCK_PERM_NAMESPACE_USE`` permission flag and ``LANDLOCK_RULE_NAMESPACE``
> +rule type.

This section would also benefit from a link to namespaces(7),
which documents the list of different namespaces.

> +
>  .. _kernel_support:
>  
>  Kernel support
> -- 
> 2.54.0
> 

Overall, I have a fair number of remarks here, but most of them are
much more on the "suggestion" side -- this documentation is much
clearer than in V1, IMHO. :)

–Günther


* Re: [PATCH v4 3/3] tpm: tpm_crb_ffa: revert defered_probed when tpm_crb_ffa is built-in
From: Sudeep Holla @ 2026-06-01  8:54 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: Jarkko Sakkinen, Sudeep Holla, linux-security-module,
	linux-kernel, linux-integrity, paul, zohar, roberto.sassu,
	noodles, jmorris, serge, dmitry.kasatkin, eric.snowberg, jgg
In-Reply-To: <ah0x+YDypYFzpFqt@e129823.arm.com>

On Mon, Jun 01, 2026 at 08:17:13AM +0100, Yeoreum Yun wrote:
> Hi Jarkko,
> 
> Sorry for the late answer.
> 
> > On Mon, May 25, 2026 at 08:54:04AM +0100, Yeoreum Yun wrote:
> > > commit 746d9e9f62a6 ("tpm: tpm_crb_ffa: try to probe tpm_crb_ffa when it's build_in")
> > > probe tpm_crb_ffa forcefully when it's built-in to integrate with IMA.
> > > 
> > > However, IMA now provides the IMA_INIT_LATE_SYNC build option, which
> > > initialises IMA at the late_initcall_sync level, so this change is no
> > > longer required.
> > > 
> > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > > ---
> > >  drivers/char/tpm/tpm_crb_ffa.c | 18 +++---------------
> > >  1 file changed, 3 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/drivers/char/tpm/tpm_crb_ffa.c b/drivers/char/tpm/tpm_crb_ffa.c
> > > index 99f1c1e5644b..025c4d4b17ca 100644
> > > --- a/drivers/char/tpm/tpm_crb_ffa.c
> > > +++ b/drivers/char/tpm/tpm_crb_ffa.c
> > > @@ -177,23 +177,13 @@ static int tpm_crb_ffa_to_linux_errno(int errno)
> > >   */
> > >  int tpm_crb_ffa_init(void)
> > >  {
> > > -	int ret = 0;
> > > -
> > > -	if (!IS_MODULE(CONFIG_TCG_ARM_CRB_FFA)) {
> > > -		ret = ffa_register(&tpm_crb_ffa_driver);
> > > -		if (ret) {
> > > -			tpm_crb_ffa = ERR_PTR(-ENODEV);
> > > -			return ret;
> > > -		}
> > > -	}
> > > -
> > >  	if (!tpm_crb_ffa)
> > > -		ret = -ENOENT;
> > > +		return -ENOENT;
> > >  
> > >  	if (IS_ERR_VALUE(tpm_crb_ffa))
> > > -		ret = -ENODEV;
> > > +		return -ENODEV;
> > >  
> > > -	return ret;
> > > +	return 0;
> > >  }
> > >  EXPORT_SYMBOL_GPL(tpm_crb_ffa_init);
> > >  
> > > @@ -405,9 +395,7 @@ static struct ffa_driver tpm_crb_ffa_driver = {
> > >  	.id_table = tpm_crb_ffa_device_id,
> > >  };
> > >  
> > > -#ifdef MODULE
> > >  module_ffa_driver(tpm_crb_ffa_driver);
> > > -#endif
> > >  
> > >  MODULE_AUTHOR("Arm");
> > >  MODULE_DESCRIPTION("TPM CRB FFA driver");
> > > -- 
> > > LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
> > > 
> > 
> > > How would we sync up this patch? Through which tree, etc.?
> 
> IMHO, the IMA-relevant changes should go into the IMA tree.
> However, I think this patch would be much easier to sync into Sudeep's
> FF-A tree, where the FF-A initialisation is reverted to device_initcall,
> unless you're uncomfortable with that.
> 
> For this, it might be better to split this patch from this series,
> since, with the above change, deferring the FF-A probe would cause
> registration of the built-in tpm_crb_ffa driver to fail.
> 
> @Sudeep what do you think?
> 

IIRC, there is/was no build dependency between these and the FF-A patches
that are queued. I agree there may be a dependency to get all the
functionality, but we can resort to linux-next for that. FF-A is not enabled
in the defconfig, so anyone working on FF-A + TPM must enable them and can
rely on -next IMHO.

That said, I have already sent the PR for FF-A to the SoC team and it is
already queued for v7.2. I don't have any other plans unless they are fixes.

-- 
Regards,
Sudeep


* Re: [PATCH v4 3/3] tpm: tpm_crb_ffa: revert defered_probed when tpm_crb_ffa is built-in
From: Yeoreum Yun @ 2026-06-01  7:17 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-security-module, linux-kernel, linux-integrity, paul, zohar,
	roberto.sassu, noodles, sudeep.holla, jmorris, serge,
	dmitry.kasatkin, eric.snowberg, jgg
In-Reply-To: <ahoXUjbsPmKxfV_R@kernel.org>

Hi Jarkko,

Sorry for the late answer.

> On Mon, May 25, 2026 at 08:54:04AM +0100, Yeoreum Yun wrote:
> > commit 746d9e9f62a6 ("tpm: tpm_crb_ffa: try to probe tpm_crb_ffa when it's build_in")
> > probe tpm_crb_ffa forcefully when it's built-in to integrate with IMA.
> > 
> > However, IMA now provides the IMA_INIT_LATE_SYNC build option, which
> > initialises IMA at the late_initcall_sync level, so this change is no
> > longer required.
> > 
> > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > ---
> >  drivers/char/tpm/tpm_crb_ffa.c | 18 +++---------------
> >  1 file changed, 3 insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/char/tpm/tpm_crb_ffa.c b/drivers/char/tpm/tpm_crb_ffa.c
> > index 99f1c1e5644b..025c4d4b17ca 100644
> > --- a/drivers/char/tpm/tpm_crb_ffa.c
> > +++ b/drivers/char/tpm/tpm_crb_ffa.c
> > @@ -177,23 +177,13 @@ static int tpm_crb_ffa_to_linux_errno(int errno)
> >   */
> >  int tpm_crb_ffa_init(void)
> >  {
> > -	int ret = 0;
> > -
> > -	if (!IS_MODULE(CONFIG_TCG_ARM_CRB_FFA)) {
> > -		ret = ffa_register(&tpm_crb_ffa_driver);
> > -		if (ret) {
> > -			tpm_crb_ffa = ERR_PTR(-ENODEV);
> > -			return ret;
> > -		}
> > -	}
> > -
> >  	if (!tpm_crb_ffa)
> > -		ret = -ENOENT;
> > +		return -ENOENT;
> >  
> >  	if (IS_ERR_VALUE(tpm_crb_ffa))
> > -		ret = -ENODEV;
> > +		return -ENODEV;
> >  
> > -	return ret;
> > +	return 0;
> >  }
> >  EXPORT_SYMBOL_GPL(tpm_crb_ffa_init);
> >  
> > @@ -405,9 +395,7 @@ static struct ffa_driver tpm_crb_ffa_driver = {
> >  	.id_table = tpm_crb_ffa_device_id,
> >  };
> >  
> > -#ifdef MODULE
> >  module_ffa_driver(tpm_crb_ffa_driver);
> > -#endif
> >  
> >  MODULE_AUTHOR("Arm");
> >  MODULE_DESCRIPTION("TPM CRB FFA driver");
> > -- 
> > LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
> > 
> 
> How would we sync up this patch? Through which tree, etc.?

IMHO, the IMA-relevant changes should go into the IMA tree.
However, I think this patch would be much easier to sync into Sudeep's
FF-A tree, where the FF-A initialisation is reverted to device_initcall,
unless you're uncomfortable with that.

For this, it might be better to split this patch from this series,
since, with the above change, deferring the FF-A probe would cause
registration of the built-in tpm_crb_ffa driver to fail.

@Sudeep what do you think?

Link: https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux.git/commit/?h=for-next/ffa/updates&id=cc7e8f21b9f0c229d68cf19a837cba82b5ac2d87 [0]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux.git/commit/?h=for-next/ffa/updates&id=e659fc8e537c7a21d5d693d6f30d8852f2fa8d91 [1]

-- 
Sincerely,
Yeoreum Yun


* Re: [PATCH] fork: Ensure copy_process() returns a valid error pointer on failure
From: Shijia Hu @ 2026-06-01  6:33 UTC (permalink / raw)
  To: alexei.starovoitov
  Cc: 2022090917019, M202472210, akpm, andrii, ast, bpf, daniel, david,
	dddddd, hushijia1, kees, kernel, linux-kernel, linux-mm,
	linux-security-module, paul
In-Reply-To: <CAADnVQ+=-UM-JC4eM=vqvgK2tLt6PwmDjOcrrDG9kz8BV6n49Q@mail.gmail.com>

On Sun, 31 May 2026 20:58:49 -0700, Alexei Starovoitov wrote:
> This was reported earlier and there is a fix in the works.
> This approach is incorrect.
> You have to fix the root cause, not the symptom.
>
> pw-bot: cr

Thanks for taking a look.

I'll drop this copy_process() approach for upstream.

Is the follow-up fix still being tracked in this thread?

  https://lore.kernel.org/all/20260411163556.8567-1-yangfeng59949@163.com/

If so, I can follow the progress there and help test the fix with the
security_* fmod_ret reproducer once it is available.

Thanks,
Shijia


* Re: [PATCH] fork: Ensure copy_process() returns a valid error pointer on failure
From: Alexei Starovoitov @ 2026-06-01  3:58 UTC (permalink / raw)
  To: Shijia Hu
  Cc: Andrew Morton, David Hildenbrand (Arm), Kees Cook, Paul Moore,
	Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, LKML,
	linux-mm, LSM List, bpf, stable, kernel, Quan Sun, Yinhao Hu,
	Kaiyan Mei
In-Reply-To: <20260601030649.2513937-1-hushijia1@uniontech.com>

On Sun, May 31, 2026 at 8:08 PM Shijia Hu <hushijia1@uniontech.com> wrote:
>
> copy_process() returns ERR_PTR(retval) from its error path, so retval
> must be a negative errno in the range [-MAX_ERRNO, -1]. Values outside
> that range produce a pointer which is not caught by IS_ERR() in
> kernel_clone().
>
> This can be triggered by attaching a BPF_MODIFY_RETURN program to
> security_task_alloc() and returning an invalid value. copy_process()
> treats the non-zero return as a failure, but ERR_PTR(1) or
> ERR_PTR(-MAX_ERRNO - 1) does not produce an error pointer recognized by
> IS_ERR(). kernel_clone() may then dereference the returned pointer.
>
> Normalize unexpected values before returning ERR_PTR() from the
> copy_process() error path. This keeps the fix local to the fork error
> handling contract and does not change BPF_MODIFY_RETURN verifier behavior.
>
> Fixes: 6ba43b761c41 ("bpf: Attachment verification for BPF_MODIFY_RETURN")
> Reported-by: Quan Sun <2022090917019@std.uestc.edu.cn>
> Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
> Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
> Closes: https://lore.kernel.org/bpf/973a1b7b-8ee7-407a-890a-11455d9cc5bf@std.uestc.edu.cn/
> Link: https://lore.kernel.org/all/20260411163556.8567-1-yangfeng59949@163.com/
> Cc: stable@vger.kernel.org
> Signed-off-by: Shijia Hu <hushijia1@uniontech.com>
> ---
>  kernel/fork.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 8ac38beae360..40bfbdfffbdc 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -2599,6 +2599,13 @@ __latent_entropy struct task_struct *copy_process(
>         spin_lock_irq(&current->sighand->siglock);
>         hlist_del_init(&delayed.node);
>         spin_unlock_irq(&current->sighand->siglock);
> +       /*
> +        * The error path returns ERR_PTR(retval), which requires retval to be a
> +        * negative errno in the range [-MAX_ERRNO, -1]. Normalize unexpected
> +        * values to avoid returning non-error pointers to callers.
> +        */
> +       if (unlikely(retval >= 0 || retval < -MAX_ERRNO))
> +               retval = -EINVAL;

This was reported earlier and there is a fix in the works.
This approach is incorrect.
You have to fix the root cause, not the symptom.

pw-bot: cr


* [PATCH] fork: Ensure copy_process() returns a valid error pointer on failure
From: Shijia Hu @ 2026-06-01  3:06 UTC (permalink / raw)
  To: akpm, david, kees
  Cc: paul, ast, andrii, daniel, linux-kernel, linux-mm,
	linux-security-module, bpf, stable, kernel, Shijia Hu, Quan Sun,
	Yinhao Hu, Kaiyan Mei

copy_process() returns ERR_PTR(retval) from its error path, so retval
must be a negative errno in the range [-MAX_ERRNO, -1]. Values outside
that range produce a pointer which is not caught by IS_ERR() in
kernel_clone().

This can be triggered by attaching a BPF_MODIFY_RETURN program to
security_task_alloc() and returning an invalid value. copy_process()
treats the non-zero return as a failure, but ERR_PTR(1) or
ERR_PTR(-MAX_ERRNO - 1) does not produce an error pointer recognized by
IS_ERR(). kernel_clone() may then dereference the returned pointer.

Normalize unexpected values before returning ERR_PTR() from the
copy_process() error path. This keeps the fix local to the fork error
handling contract and does not change BPF_MODIFY_RETURN verifier behavior.
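
As a rough userspace illustration of the range problem described above (a
sketch only, not kernel code): the helpers below model the pointer encoding
of include/linux/err.h, and normalize_retval() is a hypothetical name for
the range check this patch adds inline.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model of the kernel's error-pointer encoding. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Only the top MAX_ERRNO addresses count, i.e. errno in [-MAX_ERRNO, -1]. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical name for the check added in the copy_process() error path. */
static inline long normalize_retval(long retval)
{
	if (retval >= 0 || retval < -MAX_ERRNO)
		return -EINVAL;
	return retval;
}

/* Self-check: out-of-range values escape IS_ERR() unless normalized first. */
void err_ptr_self_check(void)
{
	assert(!IS_ERR(ERR_PTR(1)));
	assert(!IS_ERR(ERR_PTR(-MAX_ERRNO - 1)));
	assert(IS_ERR(ERR_PTR(-ENOMEM)));
	assert(IS_ERR(ERR_PTR(normalize_retval(1))));
	assert(IS_ERR(ERR_PTR(normalize_retval(-MAX_ERRNO - 1))));
}
```

In this model, ERR_PTR(1) is simply the address 1, which IS_ERR() treats as
an ordinary pointer, so a caller would go on to dereference it.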

Fixes: 6ba43b761c41 ("bpf: Attachment verification for BPF_MODIFY_RETURN")
Reported-by: Quan Sun <2022090917019@std.uestc.edu.cn>
Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
Closes: https://lore.kernel.org/bpf/973a1b7b-8ee7-407a-890a-11455d9cc5bf@std.uestc.edu.cn/
Link: https://lore.kernel.org/all/20260411163556.8567-1-yangfeng59949@163.com/
Cc: stable@vger.kernel.org
Signed-off-by: Shijia Hu <hushijia1@uniontech.com>
---
 kernel/fork.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/fork.c b/kernel/fork.c
index 8ac38beae360..40bfbdfffbdc 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2599,6 +2599,13 @@ __latent_entropy struct task_struct *copy_process(
 	spin_lock_irq(&current->sighand->siglock);
 	hlist_del_init(&delayed.node);
 	spin_unlock_irq(&current->sighand->siglock);
+	/*
+	 * The error path returns ERR_PTR(retval), which requires retval to be a
+	 * negative errno in the range [-MAX_ERRNO, -1]. Normalize unexpected
+	 * values to avoid returning non-error pointers to callers.
+	 */
+	if (unlikely(retval >= 0 || retval < -MAX_ERRNO))
+		retval = -EINVAL;
 	return ERR_PTR(retval);
 }

-- 
2.20.1


* Re: [PATCH] lsm,bpf: fix security_bpf_prog_load() error handling
From: Alexei Starovoitov @ 2026-06-01  1:42 UTC (permalink / raw)
  To: Paul Moore; +Cc: bpf, LSM List
In-Reply-To: <CAADnVQ+JDtc_GdxCv6tWAT03k85PTc0+zdcpCQ6NU-T8yJcu0A@mail.gmail.com>

On Sat, May 23, 2026 at 10:19 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sat, May 23, 2026 at 6:53 PM Paul Moore <paul@paul-moore.com> wrote:
> >
> > On May 23, 2026 11:25:55 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > > On Sat, May 23, 2026 at 6:06 PM Paul Moore <paul@paul-moore.com> wrote:
> > >> On Sat, May 23, 2026 at 12:00 PM Paul Moore <paul@paul-moore.com> wrote:
> > >>>
> > >>> If security_bpf_prog_load() fails there is no need to call into
> > >>> security_bpf_prog_free() as the LSM will handle the cleanup of any partial
> > >>> LSM state before returning to the caller with an error.  Thankfully this
> > >>> isn't an issue with any of the existing code as the LSMs which currently
> > >>> provide BPF hook callback implementations don't allocate any internal
> > >>> state, but this is something we want to fix for potential future users.
> > >>>
> > >>> Cc: bpf@vger.kernel.org
> > >>> Cc: linux-security-module@vger.kernel.org
> > >>> Signed-off-by: Paul Moore <paul@paul-moore.com>
> > >>> ---
> > >>> kernel/bpf/syscall.c | 4 +---
> > >>> 1 file changed, 1 insertion(+), 3 deletions(-)
> > >>
> > >> Alexei, I'm assuming you would prefer to take this via the BPF tree?
> > >
> > > Paul, I see that you're intentionally trying to piss me off.
> > > It's not going to work :)
> >
> > I promise you that is not the case. I was looking at the sashiko results of
> > the latest Hornet patch and it identified this potential issue in the error
> > handling code that is an issue independent of Hornet. I posted the quick
> > little patch above to fix the issue, and since the diffstat is solely in
> > kernel/bpf/syscall.c I figured you would want to merge it via the BPF tree;
> > if that is not the case let me know.
>
> It's in a queue. You can monitor it here:
> https://patchwork.kernel.org/project/netdevbpf/list/?series=&submitter=&state=&q=&archive=&delegate=121173
>
> But please be advised that when submitters ignore issues
> found by bots and maintainers agree with bot findings
> we mark patches as changes requested.

Applied.


* [PATCH v10 9/9] selftests/landlock: Add tests for invalid use of quiet flag
From: Tingmao Wang @ 2026-06-01  0:00 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Tingmao Wang, Günther Noack, Justin Suess, Jan Kara,
	Abhinav Saxena, linux-security-module
In-Reply-To: <cover.1780272022.git.m@maowtm.org>

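Make sure that invalid uses of the quiet flag fail with EINVAL: adding a
rule with LANDLOCK_ADD_RULE_QUIET to a ruleset that has no quiet access
bits, and creating a ruleset whose quiet_access_fs contains bits that
are either not handled or unknown.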

Signed-off-by: Tingmao Wang <m@maowtm.org>
---

Changes in v4:
- New patch

 tools/testing/selftests/landlock/base_test.c | 57 ++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c
index 84e91fcaa1b2..af9ad822a444 100644
--- a/tools/testing/selftests/landlock/base_test.c
+++ b/tools/testing/selftests/landlock/base_test.c
@@ -526,4 +526,61 @@ TEST(cred_transfer)
 	EXPECT_EQ(EACCES, errno);
 }
 
+TEST(useless_quiet_rule)
+{
+	struct landlock_ruleset_attr ruleset_attr = {
+		.handled_access_fs = LANDLOCK_ACCESS_FS_READ_DIR,
+		.quiet_access_fs = 0,
+	};
+	struct landlock_path_beneath_attr path_beneath_attr = {
+		.allowed_access = LANDLOCK_ACCESS_FS_READ_DIR,
+	};
+	int ruleset_fd, root_fd;
+
+	drop_caps(_metadata);
+	ruleset_fd =
+		landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+	ASSERT_LE(0, ruleset_fd);
+
+	root_fd = open("/", O_PATH | O_CLOEXEC);
+	ASSERT_LE(0, root_fd);
+	path_beneath_attr.parent_fd = root_fd;
+	ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+					&path_beneath_attr,
+					LANDLOCK_ADD_RULE_QUIET));
+	ASSERT_EQ(EINVAL, errno);
+
+	/* Check that the rule was not added. */
+	ASSERT_EQ(0, close(root_fd));
+	enforce_ruleset(_metadata, ruleset_fd);
+	ASSERT_EQ(0, close(ruleset_fd));
+
+	ASSERT_EQ(-1, open("/", O_RDONLY | O_DIRECTORY | O_CLOEXEC));
+	ASSERT_EQ(EACCES, errno);
+}
+
+TEST(invalid_quiet_bits_1)
+{
+	struct landlock_ruleset_attr ruleset_attr = {
+		.handled_access_fs = LANDLOCK_ACCESS_FS_READ_DIR,
+		.quiet_access_fs = LANDLOCK_ACCESS_FS_WRITE_FILE,
+	};
+
+	ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr,
+					      sizeof(ruleset_attr), 0));
+	ASSERT_EQ(EINVAL, errno);
+}
+
+TEST(invalid_quiet_bits_2)
+{
+	struct landlock_ruleset_attr ruleset_attr = {
+		.handled_access_fs = LANDLOCK_ACCESS_FS_READ_DIR,
+		.quiet_access_fs = 1ULL << 63,
+	};
+
+	ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr,
+					      sizeof(ruleset_attr), 0));
+	ASSERT_EQ(EINVAL, errno);
+}
+
 TEST_HARNESS_MAIN
-- 
2.54.0
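
The behaviour these tests expect can be modelled with a small user-space
sketch in C. This is only a model inferred from the three test cases
above, not the kernel implementation: the function names and the
SKETCH_* constants are hypothetical, and the value chosen for
SKETCH_ADD_RULE_QUIET is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Local stand-ins for the access bits used by the tests. */
#define SKETCH_ACCESS_FS_WRITE_FILE	(1ULL << 1)
#define SKETCH_ACCESS_FS_READ_DIR	(1ULL << 3)

/* Assumed value for the add-rule quiet flag; illustration only. */
#define SKETCH_ADD_RULE_QUIET		(1U << 0)

/*
 * Model of ruleset creation: every quiet bit must also be a handled bit.
 * An unknown bit such as 1ULL << 63 can never be handled, so the same
 * subset check rejects it as well.
 */
static int sketch_check_ruleset(uint64_t handled_access_fs,
				uint64_t quiet_access_fs)
{
	if (quiet_access_fs & ~handled_access_fs)
		return -EINVAL;

	/* Past this point the quiet mask is a subset of the handled mask. */
	assert((quiet_access_fs & handled_access_fs) == quiet_access_fs);
	return 0;
}

/*
 * Model of adding a rule: the quiet flag is rejected when the ruleset
 * has no quiet access bits, since the flag would have no effect.
 */
static int sketch_check_add_rule(uint64_t ruleset_quiet_access_fs,
				 uint32_t flags)
{
	if (flags & ~SKETCH_ADD_RULE_QUIET)
		return -EINVAL;
	if ((flags & SKETCH_ADD_RULE_QUIET) && !ruleset_quiet_access_fs)
		return -EINVAL;
	return 0;
}
```

In this model, useless_quiet_rule corresponds to
sketch_check_add_rule(0, SKETCH_ADD_RULE_QUIET), and the two
invalid_quiet_bits tests correspond to sketch_check_ruleset() with a
quiet mask that is not covered by the handled mask.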

^ permalink raw reply related

* [PATCH v10 8/9] selftests/landlock: Add tests for quiet flag with scope
From: Tingmao Wang @ 2026-06-01  0:00 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Tingmao Wang, Günther Noack, Justin Suess, Jan Kara,
	Abhinav Saxena, linux-security-module
In-Reply-To: <cover.1780272022.git.m@maowtm.org>

Enhance scoped_audit.connect_to_child and audit_flags.signal to test
interaction with various quiet flag settings.

Signed-off-by: Tingmao Wang <m@maowtm.org>
---

Changes in v4:
- New patch

 tools/testing/selftests/landlock/audit_test.c | 25 ++++--
 .../landlock/scoped_abstract_unix_test.c      | 77 ++++++++++++++++---
 2 files changed, 87 insertions(+), 15 deletions(-)

diff --git a/tools/testing/selftests/landlock/audit_test.c b/tools/testing/selftests/landlock/audit_test.c
index 7044781357c0..de4c89cdc0be 100644
--- a/tools/testing/selftests/landlock/audit_test.c
+++ b/tools/testing/selftests/landlock/audit_test.c
@@ -794,30 +794,42 @@ FIXTURE(audit_flags)
 FIXTURE_VARIANT(audit_flags)
 {
 	const int restrict_flags;
+	const __u64 quiet_scoped;
 };
 
 /* clang-format off */
 FIXTURE_VARIANT_ADD(audit_flags, default) {
 	/* clang-format on */
 	.restrict_flags = 0,
+	.quiet_scoped = 0,
 };
 
 /* clang-format off */
 FIXTURE_VARIANT_ADD(audit_flags, same_exec_off) {
 	/* clang-format on */
 	.restrict_flags = LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF,
+	.quiet_scoped = 0,
 };
 
 /* clang-format off */
 FIXTURE_VARIANT_ADD(audit_flags, subdomains_off) {
 	/* clang-format on */
 	.restrict_flags = LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF,
+	.quiet_scoped = 0,
 };
 
 /* clang-format off */
 FIXTURE_VARIANT_ADD(audit_flags, cross_exec_on) {
 	/* clang-format on */
 	.restrict_flags = LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON,
+	.quiet_scoped = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(audit_flags, signal_quieted) {
+	/* clang-format on */
+	.restrict_flags = 0,
+	.quiet_scoped = LANDLOCK_SCOPE_SIGNAL,
 };
 
 FIXTURE_SETUP(audit_flags)
@@ -861,12 +873,16 @@ TEST_F(audit_flags, signal)
 	pid_t child;
 	struct audit_records records;
 	__u64 deallocated_dom = 2;
+	bool expect_audit = !(variant->restrict_flags &
+			      LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF) &&
+			    !(variant->quiet_scoped & LANDLOCK_SCOPE_SIGNAL);
 
 	child = fork();
 	ASSERT_LE(0, child);
 	if (child == 0) {
 		const struct landlock_ruleset_attr ruleset_attr = {
 			.scoped = LANDLOCK_SCOPE_SIGNAL,
+			.quiet_scoped = variant->quiet_scoped,
 		};
 		int ruleset_fd;
 
@@ -883,8 +899,7 @@ TEST_F(audit_flags, signal)
 		EXPECT_EQ(-1, kill(getppid(), 0));
 		EXPECT_EQ(EPERM, errno);
 
-		if (variant->restrict_flags &
-		    LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF) {
+		if (!expect_audit) {
 			EXPECT_EQ(-EAGAIN, matches_log_signal(
 						   _metadata, self->audit_fd,
 						   getppid(), self->domain_id));
@@ -911,8 +926,7 @@ TEST_F(audit_flags, signal)
 
 		/* Makes sure there is no superfluous logged records. */
 		EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
-		if (variant->restrict_flags &
-		    LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF) {
+		if (!expect_audit) {
 			EXPECT_EQ(0, records.access);
 		} else {
 			EXPECT_EQ(1, records.access);
@@ -936,8 +950,7 @@ TEST_F(audit_flags, signal)
 	    WEXITSTATUS(status) != EXIT_SUCCESS)
 		_metadata->exit_code = KSFT_FAIL;
 
-	if (variant->restrict_flags &
-	    LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF) {
+	if (!expect_audit) {
 		/*
 		 * No deallocation record: denials=0 never matches a real
 		 * record.
diff --git a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
index 72f97648d4a7..d16555f7b0d3 100644
--- a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
+++ b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
@@ -293,6 +293,45 @@ FIXTURE_TEARDOWN_PARENT(scoped_audit)
 	EXPECT_EQ(0, audit_cleanup(-1, NULL));
 }
 
+FIXTURE_VARIANT(scoped_audit)
+{
+	const __u64 scoped;
+	const __u64 quiet_scoped;
+};
+
+// clang-format off
+FIXTURE_VARIANT_ADD(scoped_audit, no_quiet)
+{
+	// clang-format on
+	.scoped = LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET,
+	.quiet_scoped = 0,
+};
+
+// clang-format off
+FIXTURE_VARIANT_ADD(scoped_audit, quiet_abstract_socket)
+{
+	// clang-format on
+	.scoped = LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET,
+	.quiet_scoped = LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET,
+};
+
+// clang-format off
+FIXTURE_VARIANT_ADD(scoped_audit, quiet_abstract_socket_2)
+{
+	// clang-format on
+	.scoped = LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET | LANDLOCK_SCOPE_SIGNAL,
+	.quiet_scoped = LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
+			LANDLOCK_SCOPE_SIGNAL,
+};
+
+// clang-format off
+FIXTURE_VARIANT_ADD(scoped_audit, quiet_unrelated)
+{
+	// clang-format on
+	.scoped = LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET | LANDLOCK_SCOPE_SIGNAL,
+	.quiet_scoped = LANDLOCK_SCOPE_SIGNAL,
+};
+
 /* python -c 'print(b"\0selftests-landlock-abstract-unix-".hex().upper())' */
 #define ABSTRACT_SOCKET_PATH_PREFIX \
 	"0073656C6674657374732D6C616E646C6F636B2D61627374726163742D756E69782D"
@@ -308,6 +347,13 @@ TEST_F(scoped_audit, connect_to_child)
 	char buf;
 	int dgram_client;
 	struct audit_records records;
+	int ruleset_fd;
+	const struct landlock_ruleset_attr ruleset_attr = {
+		.scoped = variant->scoped,
+		.quiet_scoped = variant->quiet_scoped,
+	};
+	bool should_audit =
+		!(variant->quiet_scoped & LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET);
 
 	/* Makes sure there is no superfluous logged records. */
 	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
@@ -345,7 +391,14 @@ TEST_F(scoped_audit, connect_to_child)
 	EXPECT_EQ(0, close(pipe_child[1]));
 	EXPECT_EQ(0, close(pipe_parent[0]));
 
-	create_scoped_domain(_metadata, LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET);
+	ruleset_fd =
+		landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+	ASSERT_LE(0, ruleset_fd)
+	{
+		TH_LOG("Failed to create a ruleset: %s", strerror(errno));
+	}
+	enforce_ruleset(_metadata, ruleset_fd);
+	EXPECT_EQ(0, close(ruleset_fd));
 
 	/* Signals that the parent is in a domain, if any. */
 	ASSERT_EQ(1, write(pipe_parent[1], ".", 1));
@@ -360,14 +413,20 @@ TEST_F(scoped_audit, connect_to_child)
 	EXPECT_EQ(-1, err_dgram);
 	EXPECT_EQ(EPERM, errno);
 
-	EXPECT_EQ(
-		0,
-		audit_match_record(
-			self->audit_fd, AUDIT_LANDLOCK_ACCESS,
-			REGEX_LANDLOCK_PREFIX
-			" blockers=scope\\.abstract_unix_socket path=" ABSTRACT_SOCKET_PATH_PREFIX
-			"[0-9A-F]\\+$",
-			NULL));
+	if (should_audit) {
+		EXPECT_EQ(
+			0,
+			audit_match_record(
+				self->audit_fd, AUDIT_LANDLOCK_ACCESS,
+				REGEX_LANDLOCK_PREFIX
+				" blockers=scope\\.abstract_unix_socket path=" ABSTRACT_SOCKET_PATH_PREFIX
+				"[0-9A-F]\\+$",
+				NULL));
+	}
+
+	/* No other logs */
+	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
+	EXPECT_EQ(0, records.access);
 
 	ASSERT_EQ(1, write(pipe_parent[1], ".", 1));
 	EXPECT_EQ(0, close(dgram_client));
-- 
2.54.0
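
As a reading aid, the expectations encoded by expect_audit and
should_audit in the patch above can be restated as two small predicates.
This is a user-space sketch in C with local SKETCH_* constants standing
in for the uapi definitions (their values are assumptions here), and the
function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the scope and restrict-self flag bits. */
#define SKETCH_SCOPE_ABSTRACT_UNIX_SOCKET	(1ULL << 0)
#define SKETCH_SCOPE_SIGNAL			(1ULL << 1)
#define SKETCH_LOG_SAME_EXEC_OFF		(1U << 0)
#define SKETCH_LOG_NEW_EXEC_ON			(1U << 1)
#define SKETCH_LOG_SUBDOMAINS_OFF		(1U << 2)

/*
 * audit_flags.signal: a denial record is expected unless same-exec
 * logging is turned off or the signal scope is quieted.
 */
static bool sketch_expect_signal_audit(uint32_t restrict_flags,
				       uint64_t quiet_scoped)
{
	return !(restrict_flags & SKETCH_LOG_SAME_EXEC_OFF) &&
	       !(quiet_scoped & SKETCH_SCOPE_SIGNAL);
}

/*
 * scoped_audit.connect_to_child: a denial record is expected unless the
 * abstract UNIX socket scope is quieted; quieting an unrelated scope
 * does not suppress it.
 */
static bool sketch_expect_unix_audit(uint64_t quiet_scoped)
{
	bool quieted = quiet_scoped & SKETCH_SCOPE_ABSTRACT_UNIX_SOCKET;

	/* Quieting only the signal scope must leave this record in place. */
	assert(quiet_scoped != SKETCH_SCOPE_SIGNAL || !quieted);
	return !quieted;
}
```

Under these assumptions, only the same_exec_off and signal_quieted
variants suppress the signal record, and only the quiet_abstract_socket
and quiet_abstract_socket_2 variants suppress the abstract UNIX socket
record.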

^ permalink raw reply related
