Linux Security Modules development


* Re: [PATCH v5 4/4] tpm: tpm_crb_ffa: revert defered_probed when tpm_crb_ffa is built-in
From: Yeoreum Yun @ 2026-06-02 12:57 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-security-module, linux-kernel, linux-integrity, paul, zohar,
	roberto.sassu, noodles, sudeep.holla, jmorris, serge,
	dmitry.kasatkin, eric.snowberg, jgg
In-Reply-To: <ah7TAk3iItltddzT@e129823.arm.com>

> > On Mon, Jun 01, 2026 at 03:27:49PM +0100, Yeoreum Yun wrote:
> > > commit 746d9e9f62a6 ("tpm: tpm_crb_ffa: try to probe tpm_crb_ffa when it's build_in")
> > > probe tpm_crb_ffa forcefully when it's built-in to integrate with IMA.
> > > 
> > > However, IMA now provides the IMA_INIT_LATE_SYNC build option, which
> > > initialises IMA at the late_initcall_sync level, so this change is no
> > > longer required.
> > > 
> > > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > 
> > Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
> 
> Should this be a Reviewed-by tag? Thanks!

Ah, sorry. Signed-off-by is right, thanks!

-- 
Sincerely,
Yeoreum Yun

^ permalink raw reply

* Re: [PATCH v5 3/4] security: ima: rename boot_aggregate when ima is initialised at late_sync
From: Yeoreum Yun @ 2026-06-02 12:58 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: linux-security-module, linux-kernel, linux-integrity, paul,
	roberto.sassu, noodles, jarkko, sudeep.holla, jmorris, serge,
	dmitry.kasatkin, eric.snowberg, jgg, Jonathan McDowell
In-Reply-To: <5c52effb1b4723c025f478c1c902bf83a9a4d0ed.camel@linux.ibm.com>

Hi Mimi,

> On Mon, 2026-06-01 at 15:27 +0100, Yeoreum Yun wrote:
> > From: Jonathan McDowell <noodles@meta.com>
> > 
> > The Linux IMA (Integrity Measurement Architecture) subsystem used for
> > secure boot, file integrity, or remote attestation cannot be a loadable
> > module for a few reasons listed below:
> > 
> >  o Boot-Time Integrity: IMA’s main role is to measure and appraise files
> >    before they are used. This includes measuring critical system files
> >    during early boot (e.g., init, init scripts, login binaries). If IMA
> >    were a module, it would be loaded too late to cover those.
> > 
> >  o TPM Dependency: IMA integrates tightly with the TPM to record
> >    measurements into PCRs. The TPM must be initialized early (ideally
> >    before init_ima()), which aligns with IMA being built-in.
> > 
> >  o Security Model: IMA is part of a Trusted Computing Base (TCB). Making
> >    it a module would weaken the security model, as a potentially
> >    compromised system could delay or tamper with its initialization.
> > 
> > IMA must be built-in to ensure it starts measuring from the earliest
> > possible point in boot which in turn implies TPM must be initialised and
> > ready to use before IMA.
> > 
> > Unfortunately some TPM drivers (such as Arm FF-A, or SPI attached TPM
> > devices) are not reliably available during the late_initcall stage,
> > resulting in a log error:
> > 
> >   ima: No TPM chip found, activating TPM-bypass!
> > 
> > To address this issue, IMA_INIT_LATE_SYNC is introduced.
> > However, a remote attestation service cannot determine when IMA has been
> > initialized because the boot_aggregate measurement name remains unchanged,
> > even though IMA is initialized later at late_initcall_sync when
> > IMA_INIT_LATE_SYNC is enabled.
> > 
> > Therefore, use a distinct boot_aggregate name when IMA_INIT_LATE_SYNC
> > is enabled, allowing the remote attestation service to identify
> > when IMA has been initialized.
> > 
> > Signed-off-by: Jonathan McDowell <noodles@meta.com>
> > [yeoreum.yun@arm.com: modified to align with the IMA_INIT_LATE_SYNC change]
> 
> Thanks, Yeoreum. This version requires your Signed-off-by tag as well as
> Jonathan's.  Otherwise the patch looks good.

Thanks! I'll resend with my Signed-off-by added.

-- 
Sincerely,
Yeoreum Yun

^ permalink raw reply

* Re: [PATCH v5 2/4] security: ima: introduce IMA_INIT_LATE_SYNC option
From: Yeoreum Yun @ 2026-06-02 12:58 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: linux-security-module, linux-kernel, linux-integrity, paul,
	roberto.sassu, noodles, jarkko, sudeep.holla, jmorris, serge,
	dmitry.kasatkin, eric.snowberg, jgg
In-Reply-To: <1cb6e74f1d63bd256d70e5c026234d4535acb662.camel@linux.ibm.com>

On Tue, Jun 02, 2026 at 08:35:52AM -0400, Mimi Zohar wrote:
> On Mon, 2026-06-01 at 15:27 +0100, Yeoreum Yun wrote:
> > To generate the boot_aggregate log in the IMA subsystem with TPM PCR values,
> > the TPM driver must be built-in and
> > must be probed before the IMA subsystem is initialized.
> > 
> > However, when the TPM device operates over the FF-A protocol using
> > the CRB interface, probing fails and returns -EPROBE_DEFER if
> > the tpm_crb_ffa device — an FF-A device that provides the communication
> > interface to the tpm_crb driver — has not yet been probed.
> > 
> > To ensure the TPM device operating over the FF-A protocol with
> > the CRB interface is probed before IMA initialization,
> > the following conditions must be met:
> > 
> > 1. The corresponding ffa_device must be registered,
> >    which is done via ffa_init().
> > 
> > 2. The tpm_crb_driver must successfully probe this device via
> >    tpm_crb_ffa_init().
> > 
> > 3. The tpm_crb driver using CRB over FF-A can then
> >    be probed successfully. (See crb_acpi_add() and
> >    tpm_crb_ffa_init() for reference.)
> > 
> > Unfortunately, ffa_init(), tpm_crb_ffa_init(), and crb_acpi_driver_init() are
> > all registered with device_initcall, which means crb_acpi_driver_init() may
> > be invoked before ffa_init() and tpm_crb_ffa_init() are completed.
> > 
> > When this occurs, probing the TPM device is deferred.
> > However, the deferred probe can happen after the IMA subsystem
> > has already been initialized, since IMA initialization is performed
> > during late_initcall, and deferred_probe_initcall() is performed
> > at the same level.
> > 
> > A similar situation has been reported for TPM devices attached to the SPI
> > bus [0].
> > 
> > To resolve this, introduce the IMA_INIT_LATE_SYNC option to initialise
> > IMA at late_initcall_sync, so that IMA is initialised after the deferred
> > probe of the TPM device has completed.
> > 
> > When this option is enabled, modules that access files in the
> > initramfs through usermode helper calls such as request_module()
> > during initcall must not be built-in. Otherwise, IMA may miss
> > measuring those files [1].
> > 
> > Link: https://lore.kernel.org/all/aYXEepLhUouN5f99@earth.li/ [0]
> > Link: https://lore.kernel.org/all/2b3782398cc17ce9d355490a0c42ebce9120a9ae.camel@linux.ibm.com/ [1]
> > Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
> > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> 
> Reviewed-by:  Mimi Zohar <zohar@linux.ibm.com>

Thanks!

-- 
Sincerely,
Yeoreum Yun

^ permalink raw reply
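
The ordering argument in the quoted commit message can be sketched as a
small user-space model. This is only an illustration under stated
assumptions, not kernel code: the enum init_stage values and the
ima_sees_tpm() helper are hypothetical names, and the model reduces the
initcall machinery to three ordered stages.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified initcall levels, in the order the kernel runs them. */
enum init_stage {
	STAGE_DEVICE,    /* device_initcall: ffa_init(), tpm_crb_ffa_init(), crb_acpi_driver_init() */
	STAGE_LATE,      /* late_initcall: deferred_probe_initcall(), default IMA init */
	STAGE_LATE_SYNC, /* late_initcall_sync: IMA init with IMA_INIT_LATE_SYNC */
};

/*
 * Hypothetical model: does IMA find the TPM when it initialises?
 * Within a single stage this model cannot rely on any relative order,
 * so the caller passes that in explicitly.
 */
static bool ima_sees_tpm(enum init_stage tpm_probed, enum init_stage ima_init,
			 bool tpm_first_in_same_stage)
{
	assert(tpm_probed <= STAGE_LATE_SYNC && ima_init <= STAGE_LATE_SYNC);
	if (tpm_probed != ima_init)
		return tpm_probed < ima_init;
	return tpm_first_in_same_stage;
}
```

In this model, a TPM whose probe is deferred to STAGE_LATE is only
guaranteed to be visible when IMA initialises at STAGE_LATE_SYNC, for
example ima_sees_tpm(STAGE_LATE, STAGE_LATE_SYNC, false) is true while
ima_sees_tpm(STAGE_LATE, STAGE_LATE, false) is false.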

* Re: [PATCH 01/11] params: bound array element output to the caller's page buffer
From: David Laight @ 2026-06-02 13:04 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Kees Cook, Luis Chamberlain, Pengpeng Hou, stable, Petr Pavlu,
	Richard Weinberger, Anton Ivanov, Johannes Berg,
	Rafael J. Wysocki, Len Brown, Corey Minyard, Gabriel Somlo,
	Michael S. Tsirkin, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	Tvrtko Ursulin, David Airlie, Simona Vetter, Bart Van Assche,
	Jason Gunthorpe, Leon Romanovsky, Laurent Pinchart, Hans de Goede,
	Mauro Carvalho Chehab, Bjorn Helgaas, Hannes Reinecke,
	James E.J. Bottomley, Martin K. Petersen, Daniel Lezcano,
	Zhang Rui, Lukasz Luba, Greg Kroah-Hartman, Jiri Slaby,
	Alan Stern, Jason Wang, Xuan Zhuo, Eugenio Pérez,
	Jason Baron, Jim Cromie, Tiwei Bie, Benjamin Berg,
	Ilpo Järvinen, David E. Box, Maciej W. Rozycki,
	Srinivas Pandruvada, Peter Zijlstra, Heiko Carstens,
	Vasily Gorbik, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Vinod Koul, Frank Li, Daniel Gomez, Sami Tolvanen,
	Aaron Tomlin, Alexander Potapenko, Marco Elver, Dmitry Vyukov,
	Andrew Morton, John Johansen, Paul Moore, James Morris,
	Serge E. Hallyn, Georgia Garcia, kvm, dmaengine, linux-modules,
	kasan-dev, linux-mm, apparmor, linux-security-module, linux-um,
	linux-acpi, openipmi-developer, qemu-devel, intel-gfx, dri-devel,
	linux-rdma, linux-media, linux-pci, linux-scsi, linux-pm,
	linuxppc-dev, linux-serial, linux-usb, usb-storage,
	virtualization, linux-kernel, linux-arch, netdev, linux-fsdevel,
	linux-hardening
In-Reply-To: <ah699hwLxIIOZ0-7@ashevche-desk.local>

On Tue, 2 Jun 2026 14:26:46 +0300
Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:

> On Thu, May 21, 2026 at 06:33:14AM -0700, Kees Cook wrote:
> > 
> > param_array_get() appends each element's string representation into the
> > shared sysfs page buffer by passing buffer + off to the element getter.
> > 
> > That works for getters that only write a small bounded string, but
> > param_get_charp() and similar helpers format against PAGE_SIZE from the
> > pointer they receive. Once off is non-zero, an element getter can
> > therefore write past the end of the original sysfs page buffer.
> > 
> > Collect each element into a temporary PAGE_SIZE buffer first and then
> > copy only the remaining space into the caller's page buffer.  
> 
> ...
> 
> > +	elem_buf = kmalloc(PAGE_SIZE, GFP_KERNEL);  
> 
> get_free_page() (or whatever it is called)?

The kmalloc() should be faster and, I think, its result has to be aligned.
There is another patch set to replace get_free_pages() with kmalloc().

That said, all these 'show' functions should really move to a safer
interface, although at the moment it is really difficult to find the ones
that are guaranteed to be passed a page-aligned buffer.

-- David

> 
> > +	if (!elem_buf)
> > +		return -ENOMEM;
> > +
> >  	for (i = off = 0; i < (arr->num ? *arr->num : arr->max); i++) {
> > -		/* Replace \n with comma */
> > -		if (i)
> > -			buffer[off - 1] = ',';
> >  		p.arg = arr->elem + arr->elemsize * i;
> >  		check_kparam_locked(p.mod);
> > -		ret = arr->ops->get(buffer + off, &p);
> > +		ret = arr->ops->get(elem_buf, &p);
> >  		if (ret < 0)
> > -			return ret;
> > +			goto out;
> > +		ret = min(ret, (int)(PAGE_SIZE - 1 - off));  
> 
> Casts in min/max/clamp are usually discouraged. Can we make ret long
> or do something different here?
> 
> > +		if (!ret)
> > +			break;  
> 
> > +		/* Replace the previous element's trailing newline with a comma. */
> > +		if (i)
> > +			buffer[off - 1] = ',';  
> 
> Can't we do this afterwards with the help of strreplace()?
> 
> > +		memcpy(buffer + off, elem_buf, ret);
> >  		off += ret;
> > +		if (off == PAGE_SIZE - 1)
> > +			break;
> >  	}
> >  	buffer[off] = '\0';
> > -	return off;
> > +	ret = off;
> > +out:
> > +	kfree(elem_buf);
> > +	return ret;  
> 


^ permalink raw reply
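
The approach in the quoted hunk (format each element into a scratch
buffer, then copy only what still fits into the caller's buffer) can be
sketched in user-space C. This is a hedged illustration, not the kernel
code: join_bounded(), get_int_elem(), the elem_get_fn type and the cap
parameter (standing in for PAGE_SIZE) are hypothetical names, and the
scratch buffer is a small stack array rather than a kmalloc()ed page.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical element getter: formats one element plus '\n' into buf and
 * returns the formatted length, loosely like a kernel_param_ops get(). */
typedef int (*elem_get_fn)(char *buf, size_t bufsz, const void *elem);

static int get_int_elem(char *buf, size_t bufsz, const void *elem)
{
	return snprintf(buf, bufsz, "%d\n", *(const int *)elem);
}

/* Join n elements into out[cap] as "a,b,c\n" without writing past out[cap - 1]. */
static int join_bounded(char *out, size_t cap, const void *elems,
			size_t elemsize, size_t n, elem_get_fn get)
{
	char scratch[64];
	size_t off = 0;

	assert(cap > 0);
	for (size_t i = 0; i < n; i++) {
		int ret = get(scratch, sizeof(scratch),
			      (const char *)elems + i * elemsize);
		size_t room = cap - 1 - off;
		size_t len;

		if (ret < 0)
			return ret;
		/* snprintf() can report more than it stored; clamp to scratch first. */
		len = (size_t)ret < sizeof(scratch) ? (size_t)ret : sizeof(scratch) - 1;
		if (len > room)
			len = room;
		if (!len)
			break;
		/* Replace the previous element's trailing newline with a comma. */
		if (i)
			out[off - 1] = ',';
		memcpy(out + off, scratch, len);
		off += len;
	}
	out[off] = '\0';
	return (int)off;
}
```

With three ints {1, 22, 333} and a 32-byte buffer this yields
"1,22,333\n"; with a 5-byte buffer the output is truncated to "1,22"
and the terminating NUL still lands inside the buffer.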

* Re: [PATCH 02/11] hornet: invert map set check logic
From: Blaise Boscaccy @ 2026-06-02 16:57 UTC (permalink / raw)
  To: Paul Moore, Fan Wu
  Cc: Jonathan Corbet, Shuah Khan, James Morris, Serge E. Hallyn,
	Eric Biggers, James.Bottomley, linux-security-module
In-Reply-To: <CAHC9VhQ_c9BOLXbYRk3+9_TPHbFW93-NeTe3fypxnkQOif69TQ@mail.gmail.com>

Paul Moore <paul@paul-moore.com> writes:

> On Fri, May 29, 2026 at 8:57 PM Fan Wu <wufan@kernel.org> wrote:
>>
>> On Wed, May 27, 2026 at 8:09 PM Blaise Boscaccy
>> <bboscaccy@linux.microsoft.com> wrote:
>> >
>> > In a multi-map hash verification scenario, a logic bug may have
>> > allowed an attacker to provide duplicate maps to satisfy the hash
>> > check count. Instead, invert the logic to verify each map discretely.
>> >
>> > Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
>> > ---
>>
>> I just realized there is no audit event if hornet_check_prog_maps()
>> fails, probably should add one.
>
> Maybe, but I think it is important to remember that not all LSMs use
> audit for reporting, and Hornet is doing some new things from an LSM
> perspective.  I think for right now it would be sufficient to use a
> pr_notice() or a pr_notice_ratelimited() (if we are worried about
> unpriv log spam) message in hornet_check_prog_maps().  Hornet can
> always add proper audit support at a later date if deemed necessary.
>
> Blaise, do you want to submit a patch to add pr_notice{_ratelimited}()
> in the case of denial in hornet_check_prog_maps()?
>

Yeah, that works.

-blaise

> -- 
> paul-moore.com

^ permalink raw reply

* [RFC PATCH] hornet: adjustments for the updated bpf_map_ops::map_get_hash() API
From: Paul Moore @ 2026-06-02 18:36 UTC (permalink / raw)
  To: linux-security-module; +Cc: Blaise Boscaccy

Commit c48c3a7e7d5b ("bpf: Drop redundant hash_buf from map_get_hash
operation") changed the map_get_hash() API to only take a single
parameter, the bpf_map instance; this commit updates the Hornet code
accordingly.

Beyond the basic map_get_hash() usage change, this commit also removes
the remaining SHA-256 specific code from Hornet, instead relying on the
size of the bpf_map::sha field to determine the appropriate digest size.
While Hornet remains tied to SHA-256 because it is hardcoded into the
BPF subsystem, the Hornet code itself should now be fairly agile with
respect to hash algorithms.  The only area where Hornet does appear to
hardcode a hash algorithm is in the MAP_DIGEST_SIZE macro where the
bpf_map::sha field is referenced, but that is purely a field name and
if the BPF subsystem changes the name to something more generic it will
be easily caught and corrected at build time.

Signed-off-by: Paul Moore <paul@paul-moore.com>
---
 security/hornet/hornet_lsm.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/security/hornet/hornet_lsm.c b/security/hornet/hornet_lsm.c
index eeb422db1092..a1cb2e130323 100644
--- a/security/hornet/hornet_lsm.c
+++ b/security/hornet/hornet_lsm.c
@@ -17,16 +17,17 @@
 #include <linux/sort.h>
 #include <linux/asn1_decoder.h>
 #include <linux/oid_registry.h>
+#include <linux/stddef.h>
 #include "hornet.asn1.h"
 
 #define MAX_USED_MAPS 64
 
-/* The only hashing algorithm available is SHA256 due to it be hardcoded
- * in the bpf subsystem.
- */
+/* Use the hash alg hardcoded into the bpf subsystem, currently sha256 */
+#define MAP_DIGEST_SIZE (sizeof_field(struct bpf_map, sha))
+
 struct hornet_prog_security_struct {
 	int signed_hash_count;
-	unsigned char signed_hashes[SHA256_DIGEST_SIZE * MAX_USED_MAPS];
+	unsigned char signed_hashes[MAP_DIGEST_SIZE * MAX_USED_MAPS];
 };
 
 struct hornet_parse_context {
@@ -60,12 +61,12 @@ int hornet_map_hash(void *context, size_t hdrlen,
 {
 	struct hornet_parse_context *ctx = (struct hornet_parse_context *)context;
 
-	if (vlen != SHA256_DIGEST_SIZE && vlen != 0)
+	if (vlen != MAP_DIGEST_SIZE && vlen != 0)
 		return -EINVAL;
 	if (ctx->security->signed_hash_count >= MAX_USED_MAPS)
 		return -EINVAL;
 
-	memcpy(&ctx->security->signed_hashes[ctx->security->signed_hash_count * SHA256_DIGEST_SIZE],
+	memcpy(&ctx->security->signed_hashes[ctx->security->signed_hash_count * MAP_DIGEST_SIZE],
 	       value, vlen);
 
 	return 0;
@@ -188,7 +189,6 @@ static int hornet_bpf_prog_load_integrity(struct bpf_prog *prog, union bpf_attr
 static int hornet_check_prog_maps(struct bpf_prog *prog)
 {
 	struct hornet_prog_security_struct *security;
-	unsigned char hash[SHA256_DIGEST_SIZE];
 	struct bpf_map *map;
 	int i, j;
 	bool found;
@@ -209,12 +209,12 @@ static int hornet_check_prog_maps(struct bpf_prog *prog)
 			if (!READ_ONCE(map->frozen) || !map->ops->map_get_hash)
 				continue;
 
-			if (map->ops->map_get_hash(map, SHA256_DIGEST_SIZE, hash))
+			if (map->ops->map_get_hash(map))
 				continue;
 
-			if (memcmp(hash,
-				   &security->signed_hashes[i * SHA256_DIGEST_SIZE],
-				   SHA256_DIGEST_SIZE) == 0) {
+			if (memcmp(map->sha,
+				   &security->signed_hashes[i * MAP_DIGEST_SIZE],
+				   MAP_DIGEST_SIZE) == 0) {
 				found = true;
 				break;
 			}
-- 
2.54.0


^ permalink raw reply related
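
The idea of deriving the digest size from the field itself, rather than
from a SHA-256 constant, can be sketched in user-space C. This is a
minimal sketch under assumptions: struct demo_map, DEMO_DIGEST_SIZE,
DEMO_MAX_HASHES and find_signed_hash() are hypothetical stand-ins, not
Hornet or BPF code; only the sizeof_field() definition mirrors the
kernel helper.

```c
#include <assert.h>
#include <string.h>

/* Same definition as the kernel's sizeof_field() helper. */
#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))

/* Hypothetical stand-in for struct bpf_map; only the digest field matters. */
struct demo_map {
	unsigned char sha[32];
};

/* The digest size follows the field, so resizing sha[] needs no other change. */
#define DEMO_DIGEST_SIZE (sizeof_field(struct demo_map, sha))
#define DEMO_MAX_HASHES 4

/* Return the index of the signed hash matching map->sha, or -1 if none does. */
static int find_signed_hash(const unsigned char *signed_hashes, int count,
			    const struct demo_map *map)
{
	assert(count >= 0 && count <= DEMO_MAX_HASHES);
	for (int i = 0; i < count; i++) {
		if (memcmp(map->sha, &signed_hashes[i * DEMO_DIGEST_SIZE],
			   DEMO_DIGEST_SIZE) == 0)
			return i;
	}
	return -1;
}
```

If the field were renamed, the macro would stop compiling, which is the
build-time safety net the commit message describes.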

* Re: [RFC PATCH] hornet: adjustments for the updated bpf_map_ops::map_get_hash() API
From: Blaise Boscaccy @ 2026-06-02 19:50 UTC (permalink / raw)
  To: Paul Moore, linux-security-module
In-Reply-To: <20260602183658.161744-2-paul@paul-moore.com>

Paul Moore <paul@paul-moore.com> writes:

> Commit c48c3a7e7d5b ("bpf: Drop redundant hash_buf from map_get_hash
> operation") changed the map_get_hash() API to only take a single
> parameter, the bpf_map instance; this commit updates the Hornet code
> accordingly.
>
> Beyond the basic map_get_hash() usage change, this commit also removes
> the remaining SHA-256 specific code from Hornet, instead relying on the
> size of the bpf_map::sha field to determine the appropriate digest size.
> While Hornet remains tied to SHA-256 because it is hardcoded into the
> BPF subsystem, the Hornet code itself should now be fairly agile with
> respect to hash algorithms.  The only area where Hornet does appear to
> hardcode a hash algorithm is in the MAP_DIGEST_SIZE macro where the
> bpf_map::sha field is referenced, but that is purely a field name and
> if the BPF subsystem changes the name to something more generic it will
> be easily caught and corrected at build time.
>
> Signed-off-by: Paul Moore <paul@paul-moore.com>
> ---
>  security/hornet/hornet_lsm.c | 22 +++++++++++-----------
>  1 file changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/security/hornet/hornet_lsm.c b/security/hornet/hornet_lsm.c
> index eeb422db1092..a1cb2e130323 100644
> --- a/security/hornet/hornet_lsm.c
> +++ b/security/hornet/hornet_lsm.c
> @@ -17,16 +17,17 @@
>  #include <linux/sort.h>
>  #include <linux/asn1_decoder.h>
>  #include <linux/oid_registry.h>
> +#include <linux/stddef.h>
>  #include "hornet.asn1.h"
>  
>  #define MAX_USED_MAPS 64
>  
> -/* The only hashing algorithm available is SHA256 due to it be hardcoded
> - * in the bpf subsystem.
> - */
> +/* Use the hash alg hardcoded into the bpf subsystem, currently sha256 */
> +#define MAP_DIGEST_SIZE (sizeof_field(struct bpf_map, sha))
> +
>  struct hornet_prog_security_struct {
>  	int signed_hash_count;
> -	unsigned char signed_hashes[SHA256_DIGEST_SIZE * MAX_USED_MAPS];
> +	unsigned char signed_hashes[MAP_DIGEST_SIZE * MAX_USED_MAPS];
>  };
>  
>  struct hornet_parse_context {
> @@ -60,12 +61,12 @@ int hornet_map_hash(void *context, size_t hdrlen,
>  {
>  	struct hornet_parse_context *ctx = (struct hornet_parse_context *)context;
>  
> -	if (vlen != SHA256_DIGEST_SIZE && vlen != 0)
> +	if (vlen != MAP_DIGEST_SIZE && vlen != 0)
>  		return -EINVAL;
>  	if (ctx->security->signed_hash_count >= MAX_USED_MAPS)
>  		return -EINVAL;
>  
> -	memcpy(&ctx->security->signed_hashes[ctx->security->signed_hash_count * SHA256_DIGEST_SIZE],
> +	memcpy(&ctx->security->signed_hashes[ctx->security->signed_hash_count * MAP_DIGEST_SIZE],
>  	       value, vlen);
>  
>  	return 0;
> @@ -188,7 +189,6 @@ static int hornet_bpf_prog_load_integrity(struct bpf_prog *prog, union bpf_attr
>  static int hornet_check_prog_maps(struct bpf_prog *prog)
>  {
>  	struct hornet_prog_security_struct *security;
> -	unsigned char hash[SHA256_DIGEST_SIZE];
>  	struct bpf_map *map;
>  	int i, j;
>  	bool found;
> @@ -209,12 +209,12 @@ static int hornet_check_prog_maps(struct bpf_prog *prog)
>  			if (!READ_ONCE(map->frozen) || !map->ops->map_get_hash)
>  				continue;
>  
> -			if (map->ops->map_get_hash(map, SHA256_DIGEST_SIZE, hash))
> +			if (map->ops->map_get_hash(map))
>  				continue;
>  
> -			if (memcmp(hash,
> -				   &security->signed_hashes[i * SHA256_DIGEST_SIZE],
> -				   SHA256_DIGEST_SIZE) == 0) {
> +			if (memcmp(map->sha,
> +				   &security->signed_hashes[i * MAP_DIGEST_SIZE],
> +				   MAP_DIGEST_SIZE) == 0) {
>  				found = true;
>  				break;
>  			}
> -- 
> 2.54.0

Acked-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>

^ permalink raw reply

* [PATCH 0/3] hornet: post-TOCTOU-fix cleanup and observability
From: Blaise Boscaccy @ 2026-06-02 20:23 UTC (permalink / raw)
  To: Paul Moore, Fan Wu, Blaise Boscaccy, linux-security-module

This is a small follow-up series tying up loose ends from
commit cf5d6b993a43 ("hornet: fix TOCTOU in signed program
verification").

Patch 1 adds a pr_notice() when hornet_check_prog_maps()
rejects a load due to a map hash mismatch. The denial path
was previously silent; this makes policy denials observable
in the kernel log without changing enforcement behavior.

Patch 2 removes LSM_INT_VERDICT_UNEXPECTED from the
lsm_integrity_verdict enum and from IPE's bpf_signature
property. The TOCTOU fix collapsed the "unexpected map hash"
case into the existing BADSIG path, so UNEXPECTED is no
longer produced by any LSM. Removing the orphan enum value
and its IPE plumbing (audit string, property enum entry,
policy parser token, evaluator case, documentation) keeps
the verdict surface consistent with what providers actually
emit.

Patch 3 updates the signing-workflow documentation in
Documentation/admin-guide/LSM/Hornet.rst. gen_sig no longer
takes per-map indices after the TOCTOU fix, so the example
invocation is corrected to drop the ":0" suffix on --add.

No functional change to enforcement; observability +
cleanup only.

Blaise Boscaccy (3):
  hornet: log map hash check failures in prog map validation
  security, ipe: Remove LSM_INT_VERDICT_UNEXPECTED support
  hornet: update signing workflow documentation

 Documentation/admin-guide/LSM/Hornet.rst | 5 +----
 Documentation/admin-guide/LSM/ipe.rst    | 6 +-----
 Documentation/security/ipe.rst           | 3 +--
 include/linux/security.h                 | 1 -
 security/hornet/hornet_lsm.c             | 1 +
 security/ipe/audit.c                     | 1 -
 security/ipe/eval.c                      | 2 --
 security/ipe/policy.h                    | 1 -
 security/ipe/policy_parser.c             | 2 --
 9 files changed, 4 insertions(+), 18 deletions(-)

--
2.53.0

^ permalink raw reply

* [PATCH 1/3] hornet: log map hash check failures in prog map validation
From: Blaise Boscaccy @ 2026-06-02 20:23 UTC (permalink / raw)
  To: Paul Moore, Fan Wu, Blaise Boscaccy, linux-security-module
In-Reply-To: <20260602202336.3579863-1-bboscaccy@linux.microsoft.com>

Add a pr_notice() before returning -EPERM when
hornet_check_prog_maps() fails to find a matching map hash.

This makes policy denials observable in kernel logs and improves
triage/debuggability of rejected BPF program loads without changing
enforcement behavior.

Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
---
 security/hornet/hornet_lsm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/security/hornet/hornet_lsm.c b/security/hornet/hornet_lsm.c
index eeb422db1092d..fe133a0e8a11a 100644
--- a/security/hornet/hornet_lsm.c
+++ b/security/hornet/hornet_lsm.c
@@ -221,6 +221,7 @@ static int hornet_check_prog_maps(struct bpf_prog *prog)
 		}
 		if (!found) {
 			mutex_unlock(&prog->aux->used_maps_mutex);
+			pr_notice("hornet: map hash check failed");
 			return -EPERM;
 		}
 	}
-- 
2.53.0


^ permalink raw reply related

* [PATCH 2/3] security, ipe: Remove LSM_INT_VERDICT_UNEXPECTED support
From: Blaise Boscaccy @ 2026-06-02 20:23 UTC (permalink / raw)
  To: Paul Moore, Fan Wu, Blaise Boscaccy, linux-security-module
In-Reply-To: <20260602202336.3579863-1-bboscaccy@linux.microsoft.com>

After commit cf5d6b993a43 ("hornet: fix TOCTOU in signed program
verification") LSM_INT_VERDICT_UNEXPECTED is no longer produced
by any LSM. Remove support for the orphaned enum value from IPE and
the set of possible verdicts.

Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
---
 Documentation/admin-guide/LSM/Hornet.rst | 3 ---
 Documentation/admin-guide/LSM/ipe.rst    | 6 +-----
 Documentation/security/ipe.rst           | 3 +--
 include/linux/security.h                 | 1 -
 security/ipe/audit.c                     | 1 -
 security/ipe/eval.c                      | 2 --
 security/ipe/policy.h                    | 1 -
 security/ipe/policy_parser.c             | 2 --
 8 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/Documentation/admin-guide/LSM/Hornet.rst b/Documentation/admin-guide/LSM/Hornet.rst
index a369bc11408f4..13dcf686ead71 100644
--- a/Documentation/admin-guide/LSM/Hornet.rst
+++ b/Documentation/admin-guide/LSM/Hornet.rst
@@ -47,9 +47,6 @@ make policy decisions based on the verification outcome:
 ``LSM_INT_VERDICT_FAULT``
   A system error occurred during verification.
 
-``LSM_INT_VERDICT_UNEXPECTED``
-  An unexpected map hash value was encountered.
-
 ``LSM_INT_VERDICT_BADSIG``
   The signature or a map hash failed verification.
 
diff --git a/Documentation/admin-guide/LSM/ipe.rst b/Documentation/admin-guide/LSM/ipe.rst
index d68ba9d98859e..a525b4cbb4f09 100644
--- a/Documentation/admin-guide/LSM/ipe.rst
+++ b/Documentation/admin-guide/LSM/ipe.rst
@@ -736,7 +736,7 @@ bpf_signature
    ``IPE_PROP_BPF_SIGNATURE`` config option.
    The format of this property is::
 
-      bpf_signature=(NONE|OK|UNSIGNED|PARTIALSIG|UNKNOWNKEY|UNEXPECTED|FAULT|BADSIG)
+      bpf_signature=(NONE|OK|UNSIGNED|PARTIALSIG|UNKNOWNKEY|FAULT|BADSIG)
 
    The possible values correspond to the integrity verdicts from Hornet:
 
@@ -762,10 +762,6 @@ bpf_signature
 
          The keyring requested by the user is invalid.
 
-      ``UNEXPECTED``
-
-         An unexpected map hash value was encountered during verification.
-
       ``FAULT``
 
          A system error occurred during signature verification.
diff --git a/Documentation/security/ipe.rst b/Documentation/security/ipe.rst
index c51dcb16a377b..6a8d28a1b6be0 100644
--- a/Documentation/security/ipe.rst
+++ b/Documentation/security/ipe.rst
@@ -439,8 +439,7 @@ The hook flow is:
      ``attr->fd_array``. The function produces one of
      ``LSM_INT_VERDICT_OK``, ``LSM_INT_VERDICT_UNSIGNED``,
      ``LSM_INT_VERDICT_BADSIG``, ``LSM_INT_VERDICT_PARTIALSIG``,
-     ``LSM_INT_VERDICT_UNKNOWNKEY``, ``LSM_INT_VERDICT_UNEXPECTED``, or
-     ``LSM_INT_VERDICT_FAULT``.
+     ``LSM_INT_VERDICT_UNKNOWNKEY``, or ``LSM_INT_VERDICT_FAULT``.
   3. Hornet calls ``security_bpf_prog_load_post_integrity()`` with the
      resulting verdict and its ``lsm_id``. IPE's
      ``ipe_bpf_prog_load_post_integrity`` handler does **not** enforce
diff --git a/include/linux/security.h b/include/linux/security.h
index 598cd2eb1dcd5..2476ece76db73 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -107,7 +107,6 @@ enum lsm_integrity_verdict {
 	LSM_INT_VERDICT_UNSIGNED,
 	LSM_INT_VERDICT_PARTIALSIG,
 	LSM_INT_VERDICT_UNKNOWNKEY,
-	LSM_INT_VERDICT_UNEXPECTED,
 	LSM_INT_VERDICT_FAULT,
 	LSM_INT_VERDICT_BADSIG,
 };
diff --git a/security/ipe/audit.c b/security/ipe/audit.c
index 77bbf04d950bd..a2ae22cbc61ed 100644
--- a/security/ipe/audit.c
+++ b/security/ipe/audit.c
@@ -69,7 +69,6 @@ static const char *const audit_prop_names[__IPE_PROP_MAX] = {
 	"bpf_signature=UNSIGNED",
 	"bpf_signature=PARTIALSIG",
 	"bpf_signature=UNKNOWNKEY",
-	"bpf_signature=UNEXPECTED",
 	"bpf_signature=FAULT",
 	"bpf_signature=BADSIG",
 	"bpf_keyring=BUILTIN",
diff --git a/security/ipe/eval.c b/security/ipe/eval.c
index 23ae1edf896b0..58a168e9ebe2b 100644
--- a/security/ipe/eval.c
+++ b/security/ipe/eval.c
@@ -374,8 +374,6 @@ static bool evaluate_property(const struct ipe_eval_ctx *const ctx,
 		return evaluate_bpf_sig(ctx, LSM_INT_VERDICT_PARTIALSIG);
 	case IPE_PROP_BPF_SIG_UNKNOWNKEY:
 		return evaluate_bpf_sig(ctx, LSM_INT_VERDICT_UNKNOWNKEY);
-	case IPE_PROP_BPF_SIG_UNEXPECTED:
-		return evaluate_bpf_sig(ctx, LSM_INT_VERDICT_UNEXPECTED);
 	case IPE_PROP_BPF_SIG_FAULT:
 		return evaluate_bpf_sig(ctx, LSM_INT_VERDICT_FAULT);
 	case IPE_PROP_BPF_SIG_BADSIG:
diff --git a/security/ipe/policy.h b/security/ipe/policy.h
index 748bea92beb19..ba4f529da7d72 100644
--- a/security/ipe/policy.h
+++ b/security/ipe/policy.h
@@ -45,7 +45,6 @@ enum ipe_prop_type {
 	IPE_PROP_BPF_SIG_UNSIGNED,
 	IPE_PROP_BPF_SIG_PARTIALSIG,
 	IPE_PROP_BPF_SIG_UNKNOWNKEY,
-	IPE_PROP_BPF_SIG_UNEXPECTED,
 	IPE_PROP_BPF_SIG_FAULT,
 	IPE_PROP_BPF_SIG_BADSIG,
 	IPE_PROP_BPF_KEYRING_BUILTIN,
diff --git a/security/ipe/policy_parser.c b/security/ipe/policy_parser.c
index 71f63de56616b..b2b807620d89a 100644
--- a/security/ipe/policy_parser.c
+++ b/security/ipe/policy_parser.c
@@ -287,7 +287,6 @@ static const match_table_t property_tokens = {
 	{IPE_PROP_BPF_SIG_UNSIGNED,	"bpf_signature=UNSIGNED"},
 	{IPE_PROP_BPF_SIG_PARTIALSIG,	"bpf_signature=PARTIALSIG"},
 	{IPE_PROP_BPF_SIG_UNKNOWNKEY,	"bpf_signature=UNKNOWNKEY"},
-	{IPE_PROP_BPF_SIG_UNEXPECTED,	"bpf_signature=UNEXPECTED"},
 	{IPE_PROP_BPF_SIG_FAULT,	"bpf_signature=FAULT"},
 	{IPE_PROP_BPF_SIG_BADSIG,	"bpf_signature=BADSIG"},
 	{IPE_PROP_BPF_KEYRING_BUILTIN,	"bpf_keyring=BUILTIN"},
@@ -350,7 +349,6 @@ static int parse_property(char *t, struct ipe_rule *r)
 	case IPE_PROP_BPF_SIG_UNSIGNED:
 	case IPE_PROP_BPF_SIG_PARTIALSIG:
 	case IPE_PROP_BPF_SIG_UNKNOWNKEY:
-	case IPE_PROP_BPF_SIG_UNEXPECTED:
 	case IPE_PROP_BPF_SIG_FAULT:
 	case IPE_PROP_BPF_SIG_BADSIG:
 	case IPE_PROP_BPF_KEYRING_BUILTIN:
-- 
2.53.0


^ permalink raw reply related

* [PATCH 3/3] hornet: update signing workflow documentation
From: Blaise Boscaccy @ 2026-06-02 20:23 UTC (permalink / raw)
  To: Paul Moore, Fan Wu, Blaise Boscaccy, linux-security-module
In-Reply-To: <20260602202336.3579863-1-bboscaccy@linux.microsoft.com>

After commit cf5d6b993a43 ("hornet: fix TOCTOU in signed program
verification") map indices are no longer passed into gen_sig. Fix the
lingering documentation reference.

Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
---
 Documentation/admin-guide/LSM/Hornet.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/LSM/Hornet.rst b/Documentation/admin-guide/LSM/Hornet.rst
index 13dcf686ead71..6551134d8fd59 100644
--- a/Documentation/admin-guide/LSM/Hornet.rst
+++ b/Documentation/admin-guide/LSM/Hornet.rst
@@ -290,7 +290,7 @@ A typical workflow for building and signing an eBPF light skeleton is:
        --key signing_key.pem \
        --cert signing_key.x509 \
        --data insn.bin \
-       --add map.bin:0 \
+       --add map.bin \
        --out sig.bin
 
 5. **Embed the signature** back into the header::
-- 
2.53.0


^ permalink raw reply related

* [net v3] netlabel: validate unlabeled address and mask attribute lengths
From: Chenguang Zhao @ 2026-06-03  1:13 UTC (permalink / raw)
  To: Paul Moore, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Chenguang Zhao, Simon Horman, netdev, linux-security-module

netlbl_unlabel_addrinfo_get() used the address attribute length to
determine whether the attribute data could be read as an IPv4 or IPv6
address, but did not independently validate the corresponding mask
attribute length.  A crafted Generic Netlink request could therefore
provide a valid IPv4/IPv6 address attribute with a shorter mask
attribute, which would later be read as a full struct in_addr or
struct in6_addr.

NLA_BINARY policy lengths are maximum lengths by default, so use
NLA_POLICY_EXACT_LEN() for the unlabeled IPv4/IPv6 address and mask
attributes.  This rejects short attributes during policy validation and
also exposes the exact length requirements through policy introspection.

Fixes: 8cc44579d1bd ("NetLabel: Introduce static network labels for unlabeled connections")
Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
---
v3:
 Use NLA_POLICY_EXACT_LEN() as suggested by Jakub

v2:
 https://lore.kernel.org/all/20260528015913.190970-1-zhaochenguang@kylinos.cn/

v1:
 https://lore.kernel.org/all/20260522054521.1169755-1-zhaochenguang@kylinos.cn/
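
The difference between a maximum-length and an exact-length check, and
why the removed "&&" test let a short mask through, can be sketched in
user-space C. This is an illustration under assumptions, not netlink
code: DEMO_IPV4_LEN, len_ok_max(), len_ok_exact() and
old_check_rejects() are hypothetical names, with old_check_rejects()
transcribing the condition removed by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_IPV4_LEN 4 /* stands in for sizeof(struct in_addr) */

/* Maximum-length policy: anything up to max bytes is accepted. */
static bool len_ok_max(size_t len, size_t max)
{
	return len <= max;
}

/* Exact-length policy: only the full size is accepted. */
static bool len_ok_exact(size_t len, size_t want)
{
	assert(want > 0); /* a zero-length exact policy would be meaningless */
	return len == want;
}

/* The removed check: it rejects only when BOTH comparisons differ. */
static bool old_check_rejects(size_t addr_len, size_t mask_len)
{
	return addr_len != DEMO_IPV4_LEN && addr_len != mask_len;
}
```

A 4-byte address with a 2-byte mask passes len_ok_max() for both
attributes and is not rejected by old_check_rejects(4, 2), whereas
len_ok_exact(2, DEMO_IPV4_LEN) fails for the mask.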
---
 net/netlabel/netlabel_unlabeled.c | 30 ++++++++++--------------------
 1 file changed, 10 insertions(+), 20 deletions(-)

diff --git a/net/netlabel/netlabel_unlabeled.c b/net/netlabel/netlabel_unlabeled.c
index ca7a9e2a3de7..870e7699326a 100644
--- a/net/netlabel/netlabel_unlabeled.c
+++ b/net/netlabel/netlabel_unlabeled.c
@@ -114,14 +114,14 @@ static struct genl_family netlbl_unlabel_gnl_family;
 /* NetLabel Netlink attribute policy */
 static const struct nla_policy netlbl_unlabel_genl_policy[NLBL_UNLABEL_A_MAX + 1] = {
 	[NLBL_UNLABEL_A_ACPTFLG] = { .type = NLA_U8 },
-	[NLBL_UNLABEL_A_IPV6ADDR] = { .type = NLA_BINARY,
-				      .len = sizeof(struct in6_addr) },
-	[NLBL_UNLABEL_A_IPV6MASK] = { .type = NLA_BINARY,
-				      .len = sizeof(struct in6_addr) },
-	[NLBL_UNLABEL_A_IPV4ADDR] = { .type = NLA_BINARY,
-				      .len = sizeof(struct in_addr) },
-	[NLBL_UNLABEL_A_IPV4MASK] = { .type = NLA_BINARY,
-				      .len = sizeof(struct in_addr) },
+	[NLBL_UNLABEL_A_IPV6ADDR] =
+		NLA_POLICY_EXACT_LEN(sizeof(struct in6_addr)),
+	[NLBL_UNLABEL_A_IPV6MASK] =
+		NLA_POLICY_EXACT_LEN(sizeof(struct in6_addr)),
+	[NLBL_UNLABEL_A_IPV4ADDR] =
+		NLA_POLICY_EXACT_LEN(sizeof(struct in_addr)),
+	[NLBL_UNLABEL_A_IPV4MASK] =
+		NLA_POLICY_EXACT_LEN(sizeof(struct in_addr)),
 	[NLBL_UNLABEL_A_IFACE] = { .type = NLA_NUL_STRING,
 				   .len = IFNAMSIZ - 1 },
 	[NLBL_UNLABEL_A_SECCTX] = { .type = NLA_BINARY }
@@ -757,24 +757,14 @@ static int netlbl_unlabel_addrinfo_get(struct genl_info *info,
 				       void **mask,
 				       u32 *len)
 {
-	u32 addr_len;
-
 	if (info->attrs[NLBL_UNLABEL_A_IPV4ADDR] &&
 	    info->attrs[NLBL_UNLABEL_A_IPV4MASK]) {
-		addr_len = nla_len(info->attrs[NLBL_UNLABEL_A_IPV4ADDR]);
-		if (addr_len != sizeof(struct in_addr) &&
-		    addr_len != nla_len(info->attrs[NLBL_UNLABEL_A_IPV4MASK]))
-			return -EINVAL;
-		*len = addr_len;
+		*len = sizeof(struct in_addr);
 		*addr = nla_data(info->attrs[NLBL_UNLABEL_A_IPV4ADDR]);
 		*mask = nla_data(info->attrs[NLBL_UNLABEL_A_IPV4MASK]);
 		return 0;
 	} else if (info->attrs[NLBL_UNLABEL_A_IPV6ADDR]) {
-		addr_len = nla_len(info->attrs[NLBL_UNLABEL_A_IPV6ADDR]);
-		if (addr_len != sizeof(struct in6_addr) &&
-		    addr_len != nla_len(info->attrs[NLBL_UNLABEL_A_IPV6MASK]))
-			return -EINVAL;
-		*len = addr_len;
+		*len = sizeof(struct in6_addr);
 		*addr = nla_data(info->attrs[NLBL_UNLABEL_A_IPV6ADDR]);
 		*mask = nla_data(info->attrs[NLBL_UNLABEL_A_IPV6MASK]);
 		return 0;
-- 
2.25.1
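
The difference between the removed check and exact-length validation can be
sketched in plain user-space C. This is only a model under stated
assumptions: ADDR_LEN stands in for sizeof(struct in_addr), and the three
helper functions are hypothetical names that mirror the logic described in
the commit message, not kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for sizeof(struct in_addr); an IPv4 address is 4 bytes. */
#define ADDR_LEN 4

/*
 * Model of the removed check (hypothetical helper).  The old code returned
 * -EINVAL only when the address length was wrong AND differed from the mask
 * length, so a correct address paired with a short mask was accepted.
 */
static bool old_check_accepts(size_t addr_len, size_t mask_len)
{
	return !(addr_len != ADDR_LEN && addr_len != mask_len);
}

/* Model of a maximum-length policy: NLA_BINARY with .len = ADDR_LEN. */
static bool max_len_policy_accepts(size_t len)
{
	return len <= ADDR_LEN;
}

/* Model of an exact-length policy, as NLA_POLICY_EXACT_LEN(ADDR_LEN) asks for. */
static bool exact_len_policy_accepts(size_t len)
{
	return len == ADDR_LEN;
}
```

In this model a 1-byte mask passes both the maximum-length policy and the old
in-function check when the address is 4 bytes long, while the exact-length
rule rejects it before the handler runs.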



* [PATCH v4 0/2] landlock: fix SCOPE_SIGNAL bypass on the SIGIO/fowner path
From: Bryam Vargas @ 2026-06-02 17:27 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Justin Suess, Christian Brauner, Paul Moore, James Morris,
	Serge E . Hallyn, linux-security-module, stable, linux-kernel
In-Reply-To: <20260531.irah0eiM3Chi@digikod.net>

This series fixes a LANDLOCK_SCOPE_SIGNAL bypass on the asynchronous SIGIO
(fcntl(F_SETOWN)) delivery path, and adds a regression test.

A sandboxed process that owns a file or socket can request a signal
(F_SETSIG, e.g. SIGKILL) to be delivered to a whole process group on I/O
readiness (F_SETOWN(-pgid) + O_ASYNC).  When it is the head of its own
process group -- the default after fork() -- that group still contains the
non-sandboxed process that launched it (a supervisor, a security monitor),
so the sandbox can signal processes that SCOPE_SIGNAL is meant to protect
from it.

Patch 1 narrows the same-thread-group exemption in control_current_fowner()
so a process-group fowner always records the caller's Landlock domain; the
delivery-time check in hook_file_send_sigiotask() then runs against every
group member.  The direct kill() path (hook_task_kill) is unaffected.

Patch 2 adds the regression test in scoped_signal_test.c.

The defect was introduced by commit 18eb75f3af40 ("landlock: Always allow
signals between threads of the same process") in v6.15, and is present in the
stable branches that backported it (6.12.y, 6.13.y, 6.14.y).
control_current_fowner() is identical across those branches.

A/B verified on 6.12.90 + CONFIG_SECURITY_LANDLOCK (same .config, only the fix
hunk differs): without patch 1 the new test fails (the non-sandboxed parent is
signaled, SCOPE_SIGNAL bypassed); with patch 1 the new test passes and the
landlock signal-scoping suite is 20/20.

v3 -> v4 (review feedback from Mickaël Salaün):
  - patch 1: rewrite the commit message -- drop the "cache" framing, lead with
    the threat scenario, mostly "why" and minimal "what", "process" not "task";
  - patch 1: drop PIDTYPE_SID (not possible for an fowner) and use the defensive
    condition "!= PIDTYPE_PID && != PIDTYPE_TGID"; simplify the in-code comment;
  - patch 1: remove Reported-by (implicit with the same Signed-off-by);
  - send as a proper git send-email threaded series.
  - v1/v2 were sent to security@kernel.org (embargoed; not in a public archive).

Bryam Vargas (2):
  landlock: fix LANDLOCK_SCOPE_SIGNAL bypass on the SIGIO path
  selftests/landlock: test SCOPE_SIGNAL on the SIGIO/fowner pgid path

 security/landlock/fs.c                        |  9 ++
 .../selftests/landlock/scoped_signal_test.c   | 97 +++++++++++++++++++
 2 files changed, 106 insertions(+)

base-commit: 6f3ed7fec72fc8979b2a8c7219c0a9fcfc8d07b5
-- 
2.43.0
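
For readers unfamiliar with the SIGIO path, the arming sequence described
above might look like the following user-space sketch. It is an illustration
only: arm_async_signal() is a hypothetical helper name, it uses the standard
fcntl(2) commands F_SETSIG, F_SETOWN and F_SETFL, and it does not involve
Landlock at all.

```c
#define _GNU_SOURCE /* F_SETSIG is a Linux extension */
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef F_SETSIG
#define F_SETSIG 10 /* asm-generic value, used only if the header hid it */
#endif

/*
 * Arms fd so that I/O readiness makes the kernel send sig to owner.
 * A negative owner designates a process group, as with kill(2).
 * Returns 0 on success, -1 on error (errno is set by fcntl).
 */
static int arm_async_signal(int fd, int sig, pid_t owner)
{
	int flags;

	if (fcntl(fd, F_SETSIG, sig) == -1)
		return -1;
	if (fcntl(fd, F_SETOWN, owner) == -1)
		return -1;

	/* Keep the existing status flags and add O_ASYNC. */
	flags = fcntl(fd, F_GETFL);
	if (flags == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_ASYNC) == -1 ? -1 : 0;
}
```

Calling arm_async_signal(fd, SIGURG, -getpgrp()) corresponds to the
process-group case discussed in this series; passing getpid() instead targets
a single process.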


* [PATCH v4 1/2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass on the SIGIO path
From: Bryam Vargas @ 2026-06-02 17:27 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Justin Suess, Christian Brauner, Paul Moore, James Morris,
	Serge E . Hallyn, linux-security-module, stable, linux-kernel
In-Reply-To: <20260602172741.18760-1-hexlabsecurity@proton.me>

LANDLOCK_SCOPE_SIGNAL must prevent a sandboxed process from signaling
processes outside its Landlock domain.  It can be bypassed through the
asynchronous SIGIO delivery path.

A sandboxed process that owns any file or socket can arm it with
fcntl(fd, F_SETOWN, -pgid), fcntl(fd, F_SETSIG, SIGKILL) and O_ASYNC, so
that an I/O event makes the kernel deliver the chosen signal to the whole
process group.  As the head of its own process group -- the default right
after fork() -- that group also holds the non-sandboxed process that
launched it, e.g. a supervisor or a security monitor.  The sandbox can
thus kill or repeatedly signal exactly the processes SCOPE_SIGNAL is meant
to protect from it.

The scope is enforced in hook_file_send_sigiotask() against the Landlock
domain recorded at F_SETOWN time, not the live domain of the sender.
control_current_fowner() decides whether to record that domain and skips
recording it when the fowner target is in the caller's thread group --
safe only when the target is a single process sharing the caller's
credentials (PIDTYPE_PID, PIDTYPE_TGID).  For a process group
(PIDTYPE_PGID) the target resolves to the caller itself when it is the
group head, recording is skipped, and hook_file_send_sigiotask() then lets
the signal fan out to the whole group unchecked.

Skip the recording only for the single-process target types, so the scope
is enforced against every group member at delivery time.  The direct
kill() path (hook_task_kill) already evaluates the live domain and is
unaffected.

Fixes: 18eb75f3af40 ("landlock: Always allow signals between threads of the same process")
Cc: stable@vger.kernel.org
Tested-by: Justin Suess <utilityemal77@gmail.com>
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
 security/landlock/fs.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index c1ecfe239032..2ebad70a956d 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -1909,6 +1909,15 @@ static bool control_current_fowner(struct fown_struct *const fown)
 	if (!p)
 		return true;

+	/*
+	 * A process-group fowner fans the signal out to every member at
+	 * delivery time, so record the domain for any non single-process
+	 * target -- even when it resolves to current as the group head --
+	 * and let hook_file_send_sigiotask() check the live scope.
+	 */
+	if (fown->pid_type != PIDTYPE_PID && fown->pid_type != PIDTYPE_TGID)
+		return true;
+
 	return !same_thread_group(p, current);
 }

-- 
2.43.0
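
The decision made by the hunk above can be modelled in user space as follows.
This is a simplified sketch, not the kernel function: the enum and both
helpers are hypothetical names, the early "no target process" return is left
out, and the same_thread_group parameter stands in for the result of
same_thread_group(p, current).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's enum pid_type values. */
enum model_pid_type { MODEL_PID, MODEL_TGID, MODEL_PGID };

/*
 * Returns true when the caller's Landlock domain should be recorded so
 * that it can be checked at signal delivery time.
 */
static bool model_record_domain(enum model_pid_type type,
				bool same_thread_group)
{
	/* Group targets fan out at delivery time: always record. */
	if (type != MODEL_PID && type != MODEL_TGID)
		return true;

	/* Single-process target: skip only for the caller's own threads. */
	return !same_thread_group;
}

/* Behaviour before the patch, for comparison: the type was not consulted. */
static bool model_record_domain_before(enum model_pid_type type,
				       bool same_thread_group)
{
	(void)type;
	return !same_thread_group;
}
```

For a process-group owner that resolves to the caller itself,
model_record_domain_before() returns false (nothing recorded, nothing
checked later), while model_record_domain() returns true.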


* [PATCH v4 2/2] selftests/landlock: test SCOPE_SIGNAL on the SIGIO/fowner pgid path
From: Bryam Vargas @ 2026-06-02 17:28 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Justin Suess, Christian Brauner, Paul Moore, James Morris,
	Serge E . Hallyn, linux-security-module, stable, linux-kernel
In-Reply-To: <20260602172741.18760-1-hexlabsecurity@proton.me>

Add a regression test for the LANDLOCK_SCOPE_SIGNAL bypass on the
asynchronous SIGIO delivery path.  A sandboxed process that owns a file
via fcntl(F_SETOWN, -pgrp) while sitting at the head of its process
group's PID hlist (the default position after fork()) used to have its
Landlock domain recording skipped, letting the SIGIO fan-out reach
non-sandboxed members of the process group.

The test creates a dedicated process group, sandboxes the (hlist-head)
child with LANDLOCK_SCOPE_SIGNAL, arms F_SETSIG(SIGURG) / F_SETOWN(-pgrp)
/ O_ASYNC on a pipe and triggers the fan-out.  The in-domain child must
receive the signal (proving the trigger fired); the non-sandboxed parent,
which is outside the child's domain, must not.  Without the fix the parent
is signaled and the test fails.

Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
 .../selftests/landlock/scoped_signal_test.c   | 97 +++++++++++++++++++
 1 file changed, 97 insertions(+)

diff --git a/tools/testing/selftests/landlock/scoped_signal_test.c b/tools/testing/selftests/landlock/scoped_signal_test.c
index d8bf33417619..62d86a115775 100644
--- a/tools/testing/selftests/landlock/scoped_signal_test.c
+++ b/tools/testing/selftests/landlock/scoped_signal_test.c
@@ -559,4 +559,101 @@ TEST_F(fown, sigurg_socket)
 		_metadata->exit_code = KSFT_FAIL;
 }
 
+/*
+ * Checks that LANDLOCK_SCOPE_SIGNAL is enforced on the asynchronous SIGIO
+ * delivery path (fcntl(F_SETOWN)) when the file owner is a process group.
+ *
+ * A sandboxed process sitting at the head of its process group's PID hlist
+ * (the default position right after fork()) used to escape the
+ * fcntl(F_SETOWN, -pgrp) domain recording: pid_task(pgrp, PIDTYPE_PGID)
+ * resolved to the process itself, so the same-thread-group exemption skipped
+ * recording its Landlock domain.  At SIGIO time that domain was then unset
+ * and the signal fanned out to every group member, including non-sandboxed
+ * processes outside the domain.
+ */
+TEST(sigio_to_pgid_members)
+{
+	int trigger[2], sync_child[2];
+	char buf;
+	pid_t child;
+	int status, i;
+
+	drop_caps(_metadata);
+
+	/*
+	 * Isolates the test in its own process group so the SIGIO fan-out
+	 * stays bounded to this parent and the child forked below.
+	 */
+	ASSERT_EQ(0, setpgid(0, 0));
+
+	/* The non-sandboxed parent is the protected (out-of-domain) target. */
+	ASSERT_EQ(0, setup_signal_handler(SIGURG));
+	signal_received = 0;
+
+	ASSERT_EQ(0, pipe2(trigger, O_CLOEXEC));
+	ASSERT_EQ(0, pipe2(sync_child, O_CLOEXEC));
+
+	child = fork();
+	ASSERT_LE(0, child);
+	if (child == 0) {
+		/*
+		 * The child inherits the parent's new process group and, just
+		 * attached with hlist_add_head_rcu(), is now the head of the
+		 * pgid hlist: this is the case that used to skip the recording.
+		 */
+		EXPECT_EQ(0, close(sync_child[0]));
+
+		/* In-domain positive control: the child must be signaled. */
+		ASSERT_EQ(0, setup_signal_handler(SIGURG));
+		signal_received = 0;
+
+		create_scoped_domain(_metadata, LANDLOCK_SCOPE_SIGNAL);
+
+		/* Owns the SIGIO source for the whole process group. */
+		ASSERT_EQ(0, fcntl(trigger[0], F_SETSIG, SIGURG));
+		ASSERT_EQ(0, fcntl(trigger[0], F_SETOWN, -getpgrp()));
+		ASSERT_EQ(0, fcntl(trigger[0], F_SETFL, O_ASYNC));
+
+		/* Fans SIGURG out to every member of the process group. */
+		ASSERT_EQ(1, write(trigger[1], ".", 1));
+
+		/*
+		 * The sandboxed child is in its own domain and must always be
+		 * signaled: this proves the SIGIO actually fired.
+		 */
+		for (i = 0; i < 1000 && !signal_received; i++)
+			usleep(1000);
+		EXPECT_EQ(1, signal_received);
+
+		ASSERT_EQ(1, write(sync_child[1], ".", 1));
+		EXPECT_EQ(0, close(sync_child[1]));
+
+		_exit(_metadata->exit_code);
+		return;
+	}
+	EXPECT_EQ(0, close(sync_child[1]));
+	EXPECT_EQ(0, close(trigger[0]));
+	EXPECT_EQ(0, close(trigger[1]));
+
+	/* Waits for the child to generate the SIGIO. */
+	ASSERT_EQ(1, read(sync_child[0], &buf, 1));
+	EXPECT_EQ(0, close(sync_child[0]));
+
+	/* Lets a delivered-but-pending signal run our handler, if any. */
+	for (i = 0; i < 100 && !signal_received; i++)
+		usleep(1000);
+
+	/*
+	 * SCOPE_SIGNAL must block the fan-out to this non-sandboxed parent,
+	 * which is outside the child's Landlock domain.  Before the fix the
+	 * parent was signaled here.
+	 */
+	EXPECT_EQ(0, signal_received);
+
+	ASSERT_EQ(child, waitpid(child, &status, 0));
+	if (WIFSIGNALED(status) || !WIFEXITED(status) ||
+	    WEXITSTATUS(status) != EXIT_SUCCESS)
+		_metadata->exit_code = KSFT_FAIL;
+}
+
 TEST_HARNESS_MAIN
-- 
2.43.0




* Re: [PATCH v5 7/8] vfs: Replace security_sb_mount/security_move_mount with granular hooks
From: Song Liu @ 2026-06-03 15:42 UTC (permalink / raw)
  To: brauner
  Cc: paul, jmorris, serge, viro, jack, john.johansen,
	stephen.smalley.work, omosnace, mic, gnoack, takedakn,
	penguin-kernel, herton, kernel-team, linux-security-module,
	linux-fsdevel, apparmor, selinux
In-Reply-To: <20260528182607.3150386-8-song@kernel.org>

Hi Christian,

On Thu, May 28, 2026 at 11:26 AM Song Liu <song@kernel.org> wrote:
>
> Replace the monolithic security_sb_mount() call in path_mount() and
> security_move_mount() in vfs_move_mount() with the new granular mount
> hooks:
>
> - do_loopback(): call security_mount_bind()
> - do_new_mount(): call security_mount_new()
> - do_remount(): call security_mount_remount()
> - do_reconfigure_mnt(): call security_mount_reconfigure()
> - do_move_mount_old(): call security_mount_move()
> - do_change_type(): call security_mount_change_type()
> - vfs_move_mount(): replace security_move_mount() with
>   security_mount_move()

Does this version look good to you?

Thanks,
Song
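
As a reading aid for the hook list quoted above, here is a small user-space
model of which hook a given set of mount(2) flags would reach. It is a sketch
under assumptions: model_mount_hook() is a hypothetical helper, the flag
tests follow the order I understand path_mount() to use in mainline, and the
patched code may differ in detail. Only the hook names come from the quoted
patch description.

```c
#include <sys/mount.h>

/* Maps mount(2) flags to the name of the granular hook in this model. */
static const char *model_mount_hook(unsigned long flags)
{
	/* Remount of a bind mount: do_reconfigure_mnt() */
	if ((flags & (MS_REMOUNT | MS_BIND)) == (MS_REMOUNT | MS_BIND))
		return "security_mount_reconfigure";
	/* Plain remount: do_remount() */
	if (flags & MS_REMOUNT)
		return "security_mount_remount";
	/* Bind mount: do_loopback() */
	if (flags & MS_BIND)
		return "security_mount_bind";
	/* Propagation change: do_change_type() */
	if (flags & (MS_SHARED | MS_PRIVATE | MS_SLAVE | MS_UNBINDABLE))
		return "security_mount_change_type";
	/* Move of an existing mount: do_move_mount_old() */
	if (flags & MS_MOVE)
		return "security_mount_move";
	/* Everything else is a new mount: do_new_mount() */
	return "security_mount_new";
}
```

The model leaves out vfs_move_mount(), which per the description above also
ends up in security_mount_move().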


* Re: [PATCH v4 0/2] Delete task_euid()
From: Paul Moore @ 2026-06-03 16:04 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Serge Hallyn, Jonathan Corbet, Greg Kroah-Hartman, Shuah Khan,
	Alex Shi, Yanteng Si, Dongliang Mu, Miguel Ojeda, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, Danilo Krummrich, Jann Horn, linux-security-module,
	linux-doc, linux-kernel, rust-for-linux
In-Reply-To: <ah51DY5yfaNZejBd@google.com>

On Tue, Jun 2, 2026 at 2:15 AM Alice Ryhl <aliceryhl@google.com> wrote:
> On Mon, Jun 01, 2026 at 07:13:37PM -0400, Paul Moore wrote:
> > On Fri, May 29, 2026 at 5:33 AM Alice Ryhl <aliceryhl@google.com> wrote:
> > >
> > > The task_euid() method is a very weird method, and Binder was the only
> > > user. As of commit 65b672152289 ("binder: use current_euid() for
> > > transaction sender identity") Binder doesn't use task_euid() anymore,
> > > so we can delete this method.
> >
> > Given the problems from last time, it seems like it might be prudent
> > to let the commit have some time to "breathe" in a proper release, I'd
> > suggest merging this not for the upcoming v7.2 merge window but
> > instead waiting for v7.3.
>
> Sure, that makes sense. I'll resend after the merge window.

No need to resend if there are no changes (see below), it's in
patchwork and I'm tracking it so you're all set.  I'll send another
notice when I merge it.

> > > My suggestion would be to merge this through the LSM tree.
> >
> > That's fine with me.  I'd also suggest updating the commit description
> > in patch 1/2 to indicate that binder is no longer using task_euid();
> > it currently reads like it is still being used.
>
> I guess this occurred because when patch 1 was written, it really *was*
> still being used.

Yeah, I understand the world has changed since patch 1/2 was written,
which is okay, we just need to update the commit description ... which
should be a trivial task.

> Perhaps we could pick up only patch 1 now since even
> if we run into problems and Binder has to go back to using task_euid(),
> clarifying the docs is still useful.

I assumed that was one of the reasons for splitting the changes across
two patches (reverting patch 2/2 leaves patch 1/2 intact).
Regardless, we're at -rc6 and with patch 1/2 being purely a comment
update I don't see an urgent rush on this, especially considering that
if I did pick it up now, it would be for the v7.2 merge window and the
binder/current_euid() change will ship in v7.1.

Let's update the commit description - you've got a couple of weeks to
do that - and then we'll merge everything once the v7.2 merge window
closes.

-- 
paul-moore.com


* Re: [PATCH v4 0/2] Delete task_euid()
From: Alice Ryhl @ 2026-06-03 17:05 UTC (permalink / raw)
  To: Paul Moore
  Cc: Serge Hallyn, Jonathan Corbet, Greg Kroah-Hartman, Shuah Khan,
	Alex Shi, Yanteng Si, Dongliang Mu, Miguel Ojeda, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, Danilo Krummrich, Jann Horn, linux-security-module,
	linux-doc, linux-kernel, rust-for-linux
In-Reply-To: <CAHC9VhQyNzxJgdMkmEsOeAQ7Wt2L+eW6aNLjeoYmnCQLmcYRnw@mail.gmail.com>

On Wed, Jun 3, 2026 at 6:05 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Tue, Jun 2, 2026 at 2:15 AM Alice Ryhl <aliceryhl@google.com> wrote:
> > On Mon, Jun 01, 2026 at 07:13:37PM -0400, Paul Moore wrote:
> > > On Fri, May 29, 2026 at 5:33 AM Alice Ryhl <aliceryhl@google.com> wrote:
> > > >
> > > > The task_euid() method is a very weird method, and Binder was the only
> > > > user. As of commit 65b672152289 ("binder: use current_euid() for
> > > > transaction sender identity") Binder doesn't use task_euid() anymore,
> > > > so we can delete this method.
> > >
> > > Given the problems from last time, it seems like it might be prudent
> > > to let the commit have some time to "breathe" in a proper release, I'd
> > > suggest merging this not for the upcoming v7.2 merge window but
> > > instead waiting for v7.3.
> >
> > Sure, that makes sense. I'll resend after the merge window.
>
> No need to resend if there are no changes (see below), it's in
> patchwork and I'm tracking it so you're all set.  I'll send another
> notice when I merge it.
>
> > > > My suggestion would be to merge this through the LSM tree.
> > >
> > > That's fine with me.  I'd also suggest updating the commit description
> > > in patch 1/2 to indicate that binder is no longer using task_euid();
> > > it currently reads like it is still being used.
> >
> > I guess this occurred because when patch 1 was written, it really *was*
> > still being used.
>
> Yeah, I understand the world has changed since patch 1/2 was written,
> which is okay, we just need to update the commit description ... which
> should be a trivial task.
>
> > Perhaps we could pick up only patch 1 now since even
> > if we run into problems and Binder has to go back to using task_euid(),
> > clarifying the docs is still useful.
>
> I assumed that was one of the reasons for splitting the changes across
> two patches (reverting patch 2/2 leaves patch 1/2 intact).
> Regardless, we're at -rc6 and with patch 1/2 being purely a comment
> update I don't see an urgent rush on this, especially considering that
> if I did pick it up now, it would be for the v7.2 merge window and the
> binder/current_euid() change will ship in v7.1.
>
> Let's update the commit description - you've got a couple of weeks to
> do that - and then we'll merge everything once the v7.2 merge window
> closes.

Sounds good, thanks!

Alice


* Re: -next status as at v7.1-rc6
From: Paul Moore @ 2026-06-04  0:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-security-module, Mark Brown, Blaise Boscaccy,
	Alexei Starovoitov, linux-next, linux-kernel
In-Reply-To: <CAHk-=wj6CtZS9hbwFjQcoNkPwQLoyKmk8czaBF6=bBOCYuXEUQ@mail.gmail.com>

On Tue, Jun 2, 2026 at 4:20 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, 1 Jun 2026 at 11:22, Mark Brown <broonie@kernel.org> wrote:
> >
> > We're reaching the peak of conflicts for this release cycle but
> > fundamentally everything is pretty quiet at the minute.
>
> So the "Hornet" LSM thing needs to be removed from linux-next.
>
> I'm not ever going to pull it - it has been NAK'ed by the developers
> of the actual BPF code, and now it apparently causes merge problems
> too.

It's worth mentioning that resolving the merge issue was relatively
straightforward and we had a tested patch ready in a few hours; I was
in the process of merging it when your mail hit my inbox.  We see
cross-subsystem issues like this every couple of months. The good news
is that the linux-next process works well and we are usually able to
resolve the problems quickly.

I'll touch on the NACK below.

> The LSM people need to realize that they cannot override the people
> who actually write the real code.
>
> The security layer is not boss in the relationship. It's the
> subservient party. Security is important, but LSM's are not.

[NOTE: adding the LSM list to the CC line since I just realized you
didn't include it]

At this point I believe we are all aware of your dislike of LSMs, but
I once again feel compelled to speak out against the disrespect you
have shown towards the LSM developers and users.  Most (all?) of the
major Linux distributions rely on at least one LSM to meet the needs
of their users.  We've seen the importance of LSMs to Android, and the
real impact they have had on defending against vulnerabilities.  We
also know, oddly enough in this particular case, the importance of the
LSM framework to the BPF ecosystem; the BPF LSM attach points are
critical for many BPF use cases.

LSMs may not be important to you, but they are very important for a
very large number of Linux users.

> When the maintainer of a codebase NAK's a security model, and explains
> that the code has different needs and different security models, the
> LSM people don't just ignore that and go do their thing despite the
> NAK.

I think there may be some confusion about how Hornet works and
interacts with the existing BPF subsystem.  Hornet does not touch any
code inside the BPF subsystem, and outside of some minor PKCS7 patches
to enable some things from the PKCS specs, it doesn't touch any code
outside of security/.  Hornet's design criteria required it to work
within the existing LSM hooks and remain compatible with the existing
BPF signature verification code.  Hornet follows the traditional LSM
design pattern: it works within the LSM framework, building upon the
security functionality in other subsystems to satisfy user security
needs that have otherwise been ignored.

A BPF light skeleton signed with the Hornet tools passes both the
existing BPF signature verification code and Hornet's verification.
Hornet allows signature verification using arbitrary keys/keyrings,
just like the existing BPF signature code, enabling support for
dynamically generated/signed BPF programs.  In mixed environments,
Hornet can verify the loader portion of BPF light skeletons signed
with the existing BPF signing tools, distinguishing between existing
Alexei/KP signatures and a full Hornet signature.  Hornet also
supports unsigned BPF programs when needed.

As an LSM, Hornet can be enabled/disabled at build time, kernel boot,
and at runtime (subject to the LSM providing enforcement).  If a user
is required to run a specific kernel build, e.g. an "enterprise" Linux
distro support situation, the admin has multiple ways to disable
Hornet if it is not desired.  Any Hornet signed BPF light skeletons
will load without issue on a system without Hornet and Hornet's
presence does not block the existing BPF signature verification
mechanisms.

The BPF developers will quickly point out, if they haven't mentioned
it to you already, that Hornet calls into a 'bpf_map_ops' function.
This is true, Hornet calls the map_get_hash() method to get the hash
of a BPF map so it can be verified.  The BPF devs will argue this
presents a layering violation, but I would counter that several users
in the networking stack go much further than Hornet in their use of
'bpf_map_ops'; I find it difficult to see 'bpf_map_ops' as a private
API at this point.  It is also important to mention that Hornet does
not manipulate or modify the BPF program or map state in any way
beyond storing some state in a Hornet specific LSM blob, similar to
what other LSMs have done.

We've done this work, and tried collaborating with the BPF devs for
over a year, because we believe the kernel should verify the integrity
of both the BPF light skeleton loader and the associated maps.  While
the existing BPF signature verification scheme verifies the light
skeleton loader, it requires the loader to self-verify its associated
maps.  Relying on the loader to self-verify maps is problematic
because it adds an additional burden on system builders and admins who
must now also manage and verify the signature verification in every
BPF light skeleton loader on the system (how does one know if a
specific light skeleton loader suffers from a verification bug? what
additional steps need to be added to the deployment of third-party,
binary only BPF light skeletons?).  At the very least Hornet can
ensure the integrity of both the BPF light skeleton loader and the
signed maps without requiring prior analysis of the light skeleton
loader.  That is a big win for admins who care about what code is
loaded into their kernel.

Hornet lives entirely within the LSM framework, adds no additional LSM
hooks, remains compatible with the existing BPF signature verification
code, can be disabled in multiple ways if required, provides the
verification flexibility needed to support the existing BPF ecosystem,
and helps satisfy the needs of real users.  Why can't we support
Hornet alongside the existing BPF signature verification code and let
the users employ the mechanism that works best for them?

--
paul-moore.com


* Re: [PATCH v6 11/12] ima: Support staging and deleting N measurements records
From: steven chen @ 2026-06-04  0:25 UTC (permalink / raw)
  To: Roberto Sassu, corbet, skhan, zohar, dmitry.kasatkin,
	eric.snowberg, paul, jmorris, serge
  Cc: linux-doc, linux-kernel, linux-integrity, linux-security-module,
	gregorylumen, nramas, Roberto Sassu, steven chen
In-Reply-To: <20260602111401.1706052-12-roberto.sassu@huaweicloud.com>

On 6/2/2026 4:14 AM, Roberto Sassu wrote:
> From: Roberto Sassu <roberto.sassu@huawei.com>
>
> Add support for sending a value N between 1 and ULONG_MAX to the IMA
> original measurement interface. This value represents the number of
> measurements that should be deleted from the current measurements list. In
> this case, measurements are staged in an internal non-user visible list,
> and immediately deleted.
>
> This staging method allows the remote attestation agents to easily separate
> the measurements that were verified (staged and deleted) from those that
> weren't due to the race between taking a TPM quote and reading the
> measurements list.
>
> In order to minimize the locking time of ima_extend_list_mutex, deleting
> N records is realized by doing a lockless walk in the current measurements
> list to determine the N-th entry to cut, to cut the current measurements
> list under the lock, and by deleting the excess records after releasing the
> lock.
>
> Flushing the hash table is not supported for N records, since it would
> require removing the N records one by one from the hash table under the
> ima_extend_list_mutex lock, which would increase the locking time.
>
> Link: https://github.com/linux-integrity/linux/issues/1
> Co-developed-by: Steven Chen <chenste@linux.microsoft.com>

Signed-off-by: Steven Chen <chenste@linux.microsoft.com>

> Co-developed-by: Roberto Sassu <roberto.sassu@huawei.com>
> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
> ---
>   security/integrity/ima/Kconfig     |  3 ++
>   security/integrity/ima/ima.h       |  1 +
>   security/integrity/ima/ima_fs.c    | 32 +++++++++++++--
>   security/integrity/ima/ima_queue.c | 63 ++++++++++++++++++++++++++++++
>   4 files changed, 96 insertions(+), 3 deletions(-)
>
> diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
> index 02436670f746..f4d25e045808 100644
> --- a/security/integrity/ima/Kconfig
> +++ b/security/integrity/ima/Kconfig
> @@ -341,6 +341,9 @@ config IMA_STAGING
>   	  It allows user space to stage the measurements list for deletion and
>   	  to delete the staged measurements after confirmation.
>   
> +	  Or, alternatively, it allows user space to specify N measurements
> +	  records to stage internally, so that they can be immediately deleted.
> +
>   	  On kexec, staging is aborted and any staged measurement records are
>   	  copied to the secondary kernel.
>   
> diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
> index d2e740c8ff75..7a1b2d6a8b59 100644
> --- a/security/integrity/ima/ima.h
> +++ b/security/integrity/ima/ima.h
> @@ -320,6 +320,7 @@ struct ima_template_desc *lookup_template_desc(const char *name);
>   bool ima_template_has_modsig(const struct ima_template_desc *ima_template);
>   int ima_queue_stage(void);
>   int ima_queue_staged_delete_all(void);
> +int ima_queue_delete_partial(unsigned long req_value);
>   int ima_restore_measurement_entry(struct ima_template_entry *entry);
>   int ima_restore_measurement_list(loff_t bufsize, void *buf);
>   int ima_measurements_show(struct seq_file *m, void *v);
> diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
> index 96d7503a605b..174a94740da1 100644
> --- a/security/integrity/ima/ima_fs.c
> +++ b/security/integrity/ima/ima_fs.c
> @@ -28,6 +28,7 @@
>    * Requests:
>    * 'A\n': stage the entire measurements list
>    * 'D\n': delete all staged measurements
> + * '[1, ULONG_MAX]\n' delete N measurements records
>    */
>   #define STAGED_REQ_LENGTH 21
>   
> @@ -343,6 +344,7 @@ static ssize_t _ima_measurements_write(struct file *file,
>   				       loff_t *ppos, bool staged_interface)
>   {
>   	char req[STAGED_REQ_LENGTH];
> +	unsigned long req_value;
>   	int ret;
>   
>   	if (datalen < 2 || datalen > STAGED_REQ_LENGTH)
> @@ -370,7 +372,24 @@ static ssize_t _ima_measurements_write(struct file *file,
>   		ret = ima_queue_staged_delete_all();
>   		break;
>   	default:
> -		ret = -EINVAL;
> +		if (staged_interface)
> +			return -EINVAL;
> +
> +		if (ima_flush_htable) {
> +			pr_debug("Deleting staged N measurements not supported when flushing the hash table is requested\n");
> +			return -EINVAL;
> +		}
> +
> +		ret = kstrtoul(req, 10, &req_value);
> +		if (ret < 0)
> +			return ret;
> +
> +		if (req_value == 0) {
> +			pr_debug("Must delete at least one entry\n");
> +			return -EINVAL;
> +		}
> +
> +		ret = ima_queue_delete_partial(req_value);
>   	}
>   
>   	if (ret < 0)
> @@ -379,6 +398,12 @@ static ssize_t _ima_measurements_write(struct file *file,
>   	return datalen;
>   }
>   
> +static ssize_t ima_measurements_write(struct file *file, const char __user *buf,
> +				      size_t datalen, loff_t *ppos)
> +{
> +	return _ima_measurements_write(file, buf, datalen, ppos, false);
> +}
> +
>   static ssize_t ima_measurements_staged_write(struct file *file,
>   					     const char __user *buf,
>   					     size_t datalen, loff_t *ppos)
> @@ -389,6 +414,7 @@ static ssize_t ima_measurements_staged_write(struct file *file,
>   static const struct file_operations ima_measurements_ops = {
>   	.open = ima_measurements_open,
>   	.read = seq_read,
> +	.write = ima_measurements_write,
>   	.llseek = seq_lseek,
>   	.release = ima_measurements_release,
>   };
> @@ -470,6 +496,7 @@ static int ima_ascii_measurements_open(struct inode *inode, struct file *file)
>   static const struct file_operations ima_ascii_measurements_ops = {
>   	.open = ima_ascii_measurements_open,
>   	.read = seq_read,
> +	.write = ima_measurements_write,
>   	.llseek = seq_lseek,
>   	.release = ima_measurements_release,
>   };
> @@ -603,14 +630,13 @@ static int __init create_securityfs_measurement_lists(bool staging)
>   {
>   	const struct file_operations *ascii_ops = &ima_ascii_measurements_ops;
>   	const struct file_operations *binary_ops = &ima_measurements_ops;
> -	umode_t permissions = (S_IRUSR | S_IRGRP);
> +	umode_t permissions = (S_IRUSR | S_IRGRP | S_IWUSR | S_IWGRP);
>   	const char *file_suffix = "";
>   	int count = NR_BANKS(ima_tpm_chip);
>   
>   	if (staging) {
>   		ascii_ops = &ima_ascii_measurements_staged_ops;
>   		binary_ops = &ima_measurements_staged_ops;
> -		permissions |= (S_IWUSR | S_IWGRP);
>   		file_suffix = "_staged";
>   	}
>   
> diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
> index af0502f27d57..718991ba8bcd 100644
> --- a/security/integrity/ima/ima_queue.c
> +++ b/security/integrity/ima/ima_queue.c
> @@ -405,6 +405,69 @@ int ima_queue_staged_delete_all(void)
>   	return 0;
>   }
>   
> +/**
> + * ima_queue_delete_partial - Delete current measurements
> + * @req_value: Number of measurements to delete
> + *
> + * Delete the requested number of measurements from the current measurements
> + * list, and update the number of records and the binary run-time size
> + * accordingly.
> + *
> + * Refuse to delete current measurements if measurement is suspended, so that
> + * dump can be done in a lockless way and user space is notified about current
> + * measurements being carried over to the secondary kernel, so that it does not
> + * save them twice.
> + *
> + * Return: Zero on success, a negative value otherwise.
> + */
> +int ima_queue_delete_partial(unsigned long req_value)
> +{
> +	unsigned long req_value_copy = req_value;
> +	unsigned long size_to_remove = 0, num_to_remove = 0;
> +	LIST_HEAD(ima_measurements_trim);
> +	struct ima_queue_entry *qe;
> +	int ret = 0;
> +
> +	/*
> +	 * list_for_each_entry_rcu() without rcu_read_lock() is fine because
> +	 * only list append can happen concurrently. No list replace due to the
> +	 * staging/delete writers mutual exclusion.
> +	 */
> +	list_for_each_entry_rcu(qe, &ima_measurements, later, true) {
> +		size_to_remove += get_binary_runtime_size(qe->entry);
> +		num_to_remove++;
> +
> +		if (--req_value_copy == 0)
> +			break;
> +	}
> +
> +	/* Not enough records to delete. */
> +	if (req_value_copy > 0)
> +		return -ENOENT;
> +
> +	mutex_lock(&ima_extend_list_mutex);
> +	if (ima_measurements_suspended) {
> +		mutex_unlock(&ima_extend_list_mutex);
> +		return -ESTALE;
> +	}
> +
> +	/*
> +	 * qe remains valid because ima_fs.c enforces single-writer exclusion.
> +	 */
> +	__list_cut_position(&ima_measurements_trim, &ima_measurements,
> +			    &qe->later);
> +
> +	atomic_long_sub(num_to_remove, &ima_num_records[BINARY]);
> +
> +	if (IS_ENABLED(CONFIG_IMA_KEXEC))
> +		binary_runtime_size[BINARY] -= size_to_remove;
> +
> +	mutex_unlock(&ima_extend_list_mutex);
> +
> +	ima_queue_delete(&ima_measurements_trim, false);
> +	return ret;
> +}
> +
>   /**
>    * ima_queue_delete - Delete measurements
>    * @head: List head measurements are deleted from
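
For readers following the two-phase logic of ima_queue_delete_partial() quoted above (walk the list to count entries and sum their sizes, then cut the prefix off in one step), here is a minimal user-space sketch in C. It is only a model: struct entry, list_init(), list_append() and trim_first_n() are hypothetical names that do not appear in the patch, and the sketch has none of the locking, RCU or suspend checks the kernel code relies on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ima_queue_entry: list links plus a size. */
struct entry {
	struct entry *prev, *next;
	unsigned long size;
};

/* Initialise a circular list head so that it points at itself (empty list). */
static void list_init(struct entry *head)
{
	head->prev = head->next = head;
}

/* Append e at the tail of the list anchored at head. */
static void list_append(struct entry *head, struct entry *e)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/*
 * Move the first n entries of src onto the empty list trim.
 * Returns 0 on success and stores the summed size in *size_out.
 * Returns -1 if n is zero or src holds fewer than n entries; src is then
 * left untouched.
 */
static int trim_first_n(struct entry *src, struct entry *trim,
			unsigned long n, unsigned long *size_out)
{
	struct entry *last = NULL, *e;
	unsigned long remaining = n, size = 0;

	assert(trim->next == trim);	/* trim must start out empty */

	if (n == 0)
		return -1;

	/* Phase 1: find the last entry to remove and total the sizes. */
	for (e = src->next; e != src; e = e->next) {
		size += e->size;
		last = e;
		if (--remaining == 0)
			break;
	}
	if (remaining > 0)
		return -1;

	/* Phase 2: cut [src->next, last] out of src and hang it on trim. */
	trim->next = src->next;
	trim->next->prev = trim;
	trim->prev = last;
	src->next = last->next;
	last->next->prev = src;
	last->next = trim;

	*size_out = size;
	return 0;
}
```

The failure return before anything is modified mirrors the patch's -ENOENT path for "not enough records", and refusing n == 0 mirrors the "Must delete at least one entry" check in ima_fs.c.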



^ permalink raw reply

* Re: -next status as at v7.1-rc6
From: Linus Torvalds @ 2026-06-04  0:31 UTC (permalink / raw)
  To: Paul Moore
  Cc: linux-security-module, Mark Brown, Blaise Boscaccy,
	Alexei Starovoitov, linux-next, linux-kernel
In-Reply-To: <CAHC9VhRX1=tx4+arzuH76kirYFFJozexa5L=41Bb2QbgWZXOkQ@mail.gmail.com>

On Wed, 3 Jun 2026 at 17:04, Paul Moore <paul@paul-moore.com> wrote:
>
> It's worth mentioning that resolving the merge issue was relatively
> straightforward and we had a tested patch ready in a few hours

This is not the reason I'm not going to pull it - the merge issue was
just the reminder I got about an earlier email that I had dropped on
the floor.

No, the reason I won't pull it is that the main developer I pull bpf
code from NAK'ed it.

Now, I will certainly sometimes override maintainers, so it's not
like a NAK is always some final thing.

I don't _like_ overriding developers, but I'll do it when I feel it is
necessary to make forward progress.

But I also have to feel people have been unnecessarily difficult, and
I have been extensively informed about the decision and I feel like I
can make a reasonable judgement on it.

So it happens, but it happens with my explicit understanding.

My tree is *not* some kind of "we are bypassing developers by sending
a pull request directly to Linus" tree.

NEVER is that the way things get done.

[ Yes, that too has happened, and I have done that unwittingly because
I didn't realize what was going on ]

So I will not pull this tree. End of story.

The way to get me to override developers is to make me aware of the
conflict and convince me that yes, something needs overriding - but it
is typically not very easy to do with active developers.

And honestly, I also have two+ decades of history of "LSM people
cannot agree on a single thing".

That is _literally_ why the LSM layer exists in the first place.

So when LSM people then disagree with _other_ developers, quite
frankly my immediate and visceral reaction then is "oh, these people
who have decades of history of not being able to even agree amongst
themselves are now disagreeing with outsiders too".

Put another way: LSM people have a higher barrier to convince me that
I should take their disagreements seriously.

And no, I'm afraid that may not be entirely fair.  But "history of
being disagreeable" is a thing.

              Linus

^ permalink raw reply

* Re: [PATCH v4 1/2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass on the SIGIO path
From: Günther Noack @ 2026-06-04  8:10 UTC (permalink / raw)
  To: Bryam Vargas
  Cc: Mickaël Salaün, Günther Noack, Justin Suess,
	Christian Brauner, Paul Moore, James Morris, Serge E . Hallyn,
	linux-security-module, stable, linux-kernel
In-Reply-To: <20260602172741.18760-2-hexlabsecurity@proton.me>

Hello!

Thanks for the updated patch set!

On Tue, Jun 02, 2026 at 05:27:56PM +0000, Bryam Vargas wrote:
> LANDLOCK_SCOPE_SIGNAL must prevent a sandboxed process from signaling
> processes outside its Landlock domain.  It can be bypassed through the
> asynchronous SIGIO delivery path.
> 
> A sandboxed process that owns any file or socket can arm it with
> fcntl(F_SETOWN, fd, -pgid), fcntl(F_SETSIG, fd, SIGKILL) and O_ASYNC, so
> that an I/O event makes the kernel deliver the chosen signal to the whole
> process group.  As the head of its own process group -- the default right
> after fork() -- that group also holds the non-sandboxed process that
> launched it, e.g. a supervisor or a security monitor.  The sandbox can
> thus kill or repeatedly signal exactly the processes SCOPE_SIGNAL is meant
> to protect from it.
> 
> The scope is enforced in hook_file_send_sigiotask() against the Landlock
> domain recorded at F_SETOWN time, not the live domain of the sender.
> control_current_fowner() decides whether to record that domain and skips
> recording it when the fowner target is in the caller's thread group --
> safe only when the target is a single process sharing the caller's
> credentials (PIDTYPE_PID, PIDTYPE_TGID).  For a process group
> (PIDTYPE_PGID) the target resolves to the caller itself when it is the
> group head, recording is skipped, and hook_file_send_sigiotask() then lets
> the signal fan out to the whole group unchecked.
> 
> Skip the recording only for the single-process target types, so the scope
> is enforced against every group member at delivery time.  The direct
> kill() path (hook_task_kill) already evaluates the live domain and is
> unaffected.
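
The arming sequence described in the commit message above can be sketched in user space roughly as follows. This is an illustration, not a reproducer taken from the patch: arm_sigio_for_pgrp() is a hypothetical helper name, it targets the caller's own process group via getpgrp(), and a harmless signal such as SIGUSR1 is assumed in place of SIGKILL. The F_SETSIG fallback define is there only in case the feature-test macro takes effect too late in a given build.

```c
#define _GNU_SOURCE		/* F_SETSIG is a Linux extension */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

#ifndef F_SETSIG
#define F_SETSIG 10		/* Linux value, used if _GNU_SOURCE came too late */
#endif

/*
 * Arm fd so that I/O readiness sends sig to the caller's whole process
 * group. Returns 0 on success, -1 on error (errno is set by fcntl()).
 */
static int arm_sigio_for_pgrp(int fd, int sig)
{
	int flags;

	assert(sig > 0);

	/* A negative owner value selects a process group rather than a PID. */
	if (fcntl(fd, F_SETOWN, -getpgrp()) == -1)
		return -1;

	/* Replace the default SIGIO with the chosen signal. */
	if (fcntl(fd, F_SETSIG, sig) == -1)
		return -1;

	/* Turn on signal-driven I/O for this open file description. */
	flags = fcntl(fd, F_GETFL);
	if (flags == -1)
		return -1;

	return fcntl(fd, F_SETFL, flags | O_ASYNC) == -1 ? -1 : 0;
}
```

Once a descriptor is armed this way, the signal is generated by the kernel on an I/O event rather than by a kill() call from the sandboxed task, which is why the scope check has to happen in the SIGIO delivery hook.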

Consider the following scenario:

 - Processes P1 and P2 are in the same process group
 - Threads T2.1 and T2.2 are part of P2.
   - T2.1 is the thread group leader of P2.
   - T2.2 is in a signal-scoped Landlock domain
 - T2.2 registers the SIGIO for the entire PGID
 - Someone writes to the FD, triggering the SIGIO mechanism

What I would expect in this scenario is:

 - T2.1 receives the SIGIO because it is the thread group leader for
   P2.  (SIGIO with PGID only sends to one thread per process)
 - It is OK for it to receive the signal because signals between
   sibling threads should be permitted.
 - No other threads receive SIGIO.

I believe the result after this patch is:

 - No threads receive the SIGIO at all.

This is because we have been setting T2.2's Landlock domain as the
"sending domain" for hook_file_send_sigiotask(), and that hook does
not do the "same_thread_group()" check on its own, and the thread group
leader T2.1 is outside of T2.2's Landlock domain.


To be clear, the patch is still obviously an improvement, given that
it fixes a bypass for the signaling policy; it just seems to block it
slightly too broadly in this corner scenario?

The scenario does not happen *much* in practice, because SIGIO is not
used much, and starting with 7.0, multithreaded processes should
ideally use TSYNC and have their threads all in the exact same
Landlock domain.  (Before TSYNC, this only affected the case where a
process was already(!) multithreaded at the time of Landlock
enforcement.)

I like the simplicity of this fix, but I'm afraid it does not do 100%
the correct thing.  (I have not tried it out though and I'm happy to
stand corrected if my analysis is wrong.)

The fix would be for a very fringe scenario only, where multiple
conditions come together:

- An already multithreaded process enforcing a Landlock policy
- Not using the TSYNC flag for it (since Linux 7.0)
- Using SIGIO
- Using SIGIO with signaling to a full PGID, including the current process
- SIGIO registration happens from a non-thread-leader thread
- That thread is in a signal-scoped Landlock domain

Mickaël, maybe you have some thoughts on the tradeoff?

> 
> Fixes: 18eb75f3af40 ("landlock: Always allow signals between threads of the same process")
> Cc: stable@vger.kernel.org
> Tested-by: Justin Suess <utilityemal77@gmail.com>
> Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
> ---
>  security/landlock/fs.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index c1ecfe239032..2ebad70a956d 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -1909,6 +1909,15 @@ static bool control_current_fowner(struct fown_struct *const fown)
>  	if (!p)
>  		return true;
>  
> +	/*
> +	 * A process-group fowner fans the signal out to every member at
> +	 * delivery time, so record the domain for any non single-process
> +	 * target -- even when it resolves to current as the group head --
> +	 * and let hook_file_send_sigiotask() check the live scope.
> +	 */
> +	if (fown->pid_type != PIDTYPE_PID && fown->pid_type != PIDTYPE_TGID)
> +		return true;
> +
>  	return !same_thread_group(p, current);
>  }
>  
> -- 
> 2.43.0

Thanks,
–Günther

P.S: The threaded mail is now in the right format.  Remaining nit
     though: By convention, new patchset versions are posted at the
     top (no Reply-To header in the cover letter), and this is what
     many maintainers filter for - it is easier to get maintainers'
     attention when sticking to that convention.

^ permalink raw reply

* Re: [PATCH v4 1/2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass on the SIGIO path
From: Bryam Vargas @ 2026-06-04 10:27 UTC (permalink / raw)
  To: Günther Noack
  Cc: Mickaël Salaün, Günther Noack, Justin Suess,
	Christian Brauner, Paul Moore, James Morris, Serge E . Hallyn,
	linux-security-module, stable, linux-kernel
In-Reply-To: <20260604.f1cb6ce9cd6b@gnoack.org>

Hi Günther,

> I believe the result after this patch is:
>  - No threads receive the SIGIO at all.
>
> This is because we have been setting T2.2's Landlock domain as the
> "sending domain" for the hook_file_sigiotask(), and that hook does on
> its own not do the "same_thread_group()" check [...]

Confirmed -- I traced the delivery path and your analysis holds.

For a PGID owner the signal is anchored per process on its thread-group
leader: a task is attached to pid->tasks[PIDTYPE_PGID] only in the
thread_group_leader() branch of copy_process(), so send_sigio()'s
do_each_pid_task(pid, PIDTYPE_PGID, p) walk visits exactly T2.1 for P2,
never the non-leader T2.2.  hook_file_send_sigiotask() then runs
domain_is_scoped(recorded T2.2 domain, T2.1's live domain, SIGNAL) and,
having no same_thread_group() exemption of its own (unlike
hook_task_kill()), denies it -- even though T2.1 and T2.2 share P2's
signal_struct and 18eb75f3af40 mandates that same-process delivery always
be allowed.  T2.1 is P2's only entry on the PGID list, so P2 receives
nothing.  You are right.

One thing worth putting on the record: this over-block is not introduced
by the patch.  In unpatched control_current_fowner() the PGID case already
resolves through pid_task(fown->pid, PIDTYPE_PGID), which returns an
arbitrary hlist head -- one representative leader.  Whenever that head is
outside the caller's thread group, the domain is already recorded today and
the same delivery-time denial of the registrant's own leader already fires.
The patch only makes domain recording for PGID unconditional, i.e. it turns
that order-dependent behaviour into a deterministic one while closing the
order-dependent bypass.  So the corner you describe is a pre-existing gap in
the delivery hook, not a regression in v4.

That points at the real root cause: same_thread_group is a *per-recipient*
property, but control_current_fowner() approximates it once, at F_SETOWN
time, against a single pid_task() representative.  hook_task_kill() gets
this right because it evaluates same_thread_group(p, current) live, per
actual recipient.  hook_file_send_sigiotask() is the SIGIO analogue but
delegates the whole thread-group decision to that one registration-time
check, which simply cannot capture a PGID delivery set.

So the fully-correct fix is to move the same-process exemption to delivery
time, keyed to the *registrant* rather than to current (at SIGIO time
current is the fd writer, not the task that armed F_SETOWN).  Concretely:
when hook_file_set_fowner() records the domain, also pin
get_pid(task_tgid(current)) in struct landlock_file_security; in
hook_file_send_sigiotask(), before domain_is_scoped(), return 0 when
task_tgid(tsk) == that recorded pid.  PGID owners still record the domain
(so P1 stays blocked -- the bypass fix), but the registrant's own process,
including T2.1, is always allowed -- restoring 18eb75f3af40 exactly.  The
new pid is taken/put in lockstep with fown_subject.domain under the same
file->f_owner->lock and freed in hook_file_free_security(); the equality
test follows neither pid, so there is no extra RCU surface.  Sketch:

    /* struct landlock_file_security */
    struct pid *fown_tg;   /* registrant's thread group; NULL if no domain */

    /* hook_file_set_fowner(), where fown_subject is recorded */
    fown_tg = get_pid(task_tgid(current));
    ...
    put_pid(landlock_file(file)->fown_tg);     /* release previous */
    landlock_file(file)->fown_tg = fown_tg;

    /* hook_file_free_security() */
    put_pid(landlock_file(file)->fown_tg);

    /* hook_file_send_sigiotask(), after the !subject->domain quick return */
    if (task_tgid(tsk) == landlock_file(fown->file)->fown_tg)
            return 0;   /* same process as the registrant: always allowed */

I do not see a correct fix that avoids recording the registrant's identity:
the registrant task is deliberately discarded after set_fowner (only its
domain is kept), and exempting on a shared *domain* instead would be
insecure -- sibling threads can hold different domains, and a different
process could share one.
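
The delivery-time rule proposed above can be modelled in user space as a small decision function. This is a sketch under simplifying assumptions, not kernel code: struct domain, struct task_model, struct fown_record, model_is_scoped() and sigio_allowed() are all hypothetical names, registrant_tgid stands in for the proposed fown_tg, and model_is_scoped() reduces domain_is_scoped() to an exact-domain comparison, whereas the real check is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these exist in the kernel. */
struct domain {
	int id;
};

struct task_model {
	int tgid;			/* thread group (process) id */
	const struct domain *domain;	/* NULL: not sandboxed */
};

struct fown_record {
	const struct domain *domain;	/* registrant's domain at F_SETOWN time */
	int registrant_tgid;		/* models the proposed fown_tg */
};

/* Simplified scope test: anything outside the sender's exact domain. */
static bool model_is_scoped(const struct domain *sender,
			    const struct domain *recipient)
{
	return sender != recipient;
}

/* Returns true when SIGIO delivery to recipient should be allowed. */
static bool sigio_allowed(const struct fown_record *rec,
			  const struct task_model *recipient)
{
	assert(rec && recipient);

	/* No domain recorded: nothing to enforce. */
	if (!rec->domain)
		return true;

	/* Same process as the registrant: always allowed. */
	if (recipient->tgid == rec->registrant_tgid)
		return true;

	/* Everyone else is checked against the recorded domain. */
	return !model_is_scoped(rec->domain, recipient->domain);
}
```

In the T2.1/T2.2 scenario from the review, the record holds T2.2's domain and P2's thread-group id, so the leader T2.1 passes the second test even though it is outside the domain, while P1 falls through to the scope check and is denied.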

> To be clear, the patch is still obviously an improvement [...] it just
> seems to block it slightly too broadly in this corner scenario?
> [...] Mickaël, maybe you have some thoughts on the tradeoff?

Agreed on both counts.  Mickaël -- two ways to land this:

  (a) keep v4 as is.  It closes the bypass; the residual same-process
      over-block is pre-existing, deterministic only under the stacked
      conditions Günther listed (already-multithreaded enforce, no TSYNC,
      SIGIO to a PGID that includes self, registered from a non-leader
      thread in a per-thread signal-scoped domain), and arguably tolerable.

  (b) v5 = v4 + the delivery-time exemption above.  Strictly more correct:
      it also closes the pre-existing delivery-hook gap and restores
      18eb75f3af40's same-process invariant, at the cost of one struct pid*
      in landlock_file_security.

I lean (b) -- it fixes the actual root cause rather than the one reachable
instance -- and I am happy to spin it (with an added selftest covering the
PGID-includes-self / non-leader-registrant case, A/B verified) or to hold at
v4 if you would rather keep the change minimal.  Your call on whether the
corner warrants the extra state.

> P.S: [...] new patchset versions are posted at the top (no Reply-To
>      header in the cover letter) [...]

Will do -- v5 (whichever option) goes out as a fresh top-level thread, no
In-Reply-To/Reply-To pointing back at this review.

Bryam

^ permalink raw reply

* [PATCH] keys: prevent slab cache merging for key_jar
From: Mohammed EL Kadiri @ 2026-06-04 12:50 UTC (permalink / raw)
  To: David Howells, Jarkko Sakkinen
  Cc: Paul Moore, James Morris, Serge E . Hallyn, Kees Cook,
	Vlastimil Babka, keyrings, linux-security-module, linux-hardening,
	linux-kernel, Mohammed EL Kadiri

The key_jar slab cache holds struct key objects containing cryptographic
keys, authentication tokens, and keyring linkage. This cache currently
lacks merge prevention, allowing the SLUB allocator to merge it with
other similarly-sized caches.

On a default Ubuntu 6.17.0-23-generic system, key_jar has 5 aliases,
meaning 5 unrelated object types share its slab pages. struct key is
224 bytes, placed in 256-byte slabs alongside biovec-16, maple_node,
ip6_dst_cache, task_delay_info, and kmalloc-256 users.
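
One way to observe an alias count like the one quoted above is the per-cache aliases attribute that SLUB exposes under /sys/kernel/slab/. The following is a minimal sketch in C, assuming a kernel with SLUB sysfs support and enough privilege to read the file (it is typically readable by root only); read_slab_aliases() is a hypothetical helper name and not part of this patch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read the integer stored in a SLUB "aliases" attribute, for example
 * /sys/kernel/slab/key_jar/aliases. Returns the value, or -1 if the file
 * cannot be opened or does not start with an integer.
 */
static long read_slab_aliases(const char *path)
{
	FILE *f;
	long aliases;

	assert(path != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (fscanf(f, "%ld", &aliases) != 1)
		aliases = -1;

	fclose(f);
	return aliases;
}
```

Called as read_slab_aliases("/sys/kernel/slab/key_jar/aliases"), this would be expected to report 0 on a kernel where key_jar is created with SLAB_NO_MERGE, since an unmergeable cache is never offered as a merge target.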

Cross-cache heap exploitation is a well-documented attack class
(CVE-2022-29582, CVE-2022-2588, CVE-2021-22555) where slab cache
merging enables type confusion between unrelated kernel objects. A
use-after-free in any subsystem sharing slab pages with key_jar could
allow an attacker to reclaim a freed slot as a struct key, or corrupt
an existing key through a dangling pointer to a different type.

Add SLAB_NO_MERGE to ensure key_jar receives dedicated slab pages,
eliminating cross-cache attacks targeting struct key. The memory
overhead is minimal: with 32 objects per slab page and typical key
usage bounded by system keyring size, the cost of dedicated pages is
negligible. There is zero performance impact on the allocation hot
path.

This follows the precedent set by skbuff_head_cache (net/core/skbuff.c)
which uses SLAB_NO_MERGE for similar isolation requirements.

Signed-off-by: Mohammed EL Kadiri <med08elkadiri@gmail.com>
---
 security/keys/key.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/keys/key.c b/security/keys/key.c
index 3bbdde778631..592b65cf8539 100644
--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -1275,7 +1275,7 @@ void __init key_init(void)
 {
 	/* allocate a slab in which we can store keys */
 	key_jar = kmem_cache_create("key_jar", sizeof(struct key),
-			0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+			0, SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_NO_MERGE, NULL);

 	/* add the special key types */
 	list_add_tail(&key_type_keyring.link, &key_types_list);
-- 
2.43.0

^ permalink raw reply related

* [PATCH v17 02/10] rust: types: Add Ownable/Owned types
From: Andreas Hindborg @ 2026-06-04 20:11 UTC (permalink / raw)
  To: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
	Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
	Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
	Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
	Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
	Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
	Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
	Pavel Tikhomirov, Boqun Feng, Igor Korotin, Lorenzo Stoakes,
	Liam R. Howlett, Vlastimil Babka
  Cc: linux-kernel, rust-for-linux, linux-block, linux-security-module,
	dri-devel, linux-fsdevel, linux-mm, linux-pm, linux-pci,
	Andreas Hindborg, driver-core, Asahi Lina, Oliver Mangold
In-Reply-To: <20260604-unique-ref-v17-0-7b4c3d2930b9@kernel.org>

From: Asahi Lina <lina+kernel@asahilina.net>

By analogy to `AlwaysRefCounted` and `ARef`, an `Ownable` type is a
(typically C FFI) type that *may* be owned by Rust, but need not be. Unlike
`AlwaysRefCounted`, this mechanism expects the reference to be unique
within Rust, and does not allow cloning.

Conceptually, this is similar to a `KBox<T>`, except that it delegates
resource management to the `T` instead of using a generic allocator.

[ om:
  - Split code into separate file and `pub use` it from types.rs.
  - Make from_raw() and into_raw() public.
  - Remove OwnableMut, and make DerefMut dependent on Unpin instead.
  - Usage example/doctest for Ownable/Owned.
  - Fixes to documentation and commit message.
]

Link: https://lore.kernel.org/all/20250202-rust-page-v1-1-e3170d7fe55e@asahilina.net/
Signed-off-by: Asahi Lina <lina+kernel@asahilina.net>
Co-developed-by: Oliver Mangold <oliver.mangold@pm.me>
Signed-off-by: Oliver Mangold <oliver.mangold@pm.me>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
[ Andreas: Updated documentation, examples, and formatting. Change safety
  requirements, safety comments. Use a reference for `release`. ]
Reviewed-by: Gary Guo <gary@garyguo.net>
Co-developed-by: Andreas Hindborg <a.hindborg@kernel.org>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/lib.rs       |   1 +
 rust/kernel/owned.rs     | 187 +++++++++++++++++++++++++++++++++++++++++++++++
 rust/kernel/sync/aref.rs |   5 ++
 rust/kernel/types.rs     |  11 +++
 4 files changed, 204 insertions(+)

diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index b72b2fbe046d..d07759eec799 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -100,6 +100,7 @@
 pub mod of;
 #[cfg(CONFIG_PM_OPP)]
 pub mod opp;
+pub mod owned;
 pub mod page;
 #[cfg(CONFIG_PCI)]
 pub mod pci;
diff --git a/rust/kernel/owned.rs b/rust/kernel/owned.rs
new file mode 100644
index 000000000000..456e239e906e
--- /dev/null
+++ b/rust/kernel/owned.rs
@@ -0,0 +1,187 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Unique owned pointer types for objects with custom drop logic.
+//!
+//! These pointer types are useful for C-allocated objects which by API-contract
+//! are owned by Rust, but need to be freed through the C API.
+
+use core::{
+    mem::ManuallyDrop,
+    ops::{
+        Deref,
+        DerefMut, //
+    },
+    pin::Pin,
+    ptr::NonNull, //
+};
+
+/// Types that specify their own way of performing allocation and destruction. Typically, this trait
+/// is implemented on types from the C side.
+///
+/// Implementing this trait allows types to be referenced via the [`Owned<Self>`] pointer type. This
+/// is useful when it is desirable to tie the lifetime of the reference to an owned object, rather
+/// than pass around a bare reference. [`Ownable`] types can define custom drop logic that is
+/// executed when the owned reference [`Owned<Self>`] pointing to the object is dropped.
+///
+/// Note: The underlying object is not required to provide internal reference counting, because it
+/// represents a unique, owned reference. If reference counting (on the Rust side) is required,
+/// [`AlwaysRefCounted`](crate::types::AlwaysRefCounted) should be implemented.
+///
+/// # Examples
+///
+/// A minimal example implementation of [`Ownable`] and its usage with [`Owned`] looks like
+/// this:
+///
+/// ```
+/// # #![expect(clippy::disallowed_names)]
+/// # use core::cell::Cell;
+/// # use core::ptr::NonNull;
+/// # use kernel::sync::global_lock;
+/// # use kernel::alloc::{flags, kbox::KBox, AllocError};
+/// # use kernel::types::{Owned, Ownable};
+///
+/// // Let's count the allocations to see if freeing works.
+/// kernel::sync::global_lock! {
+///     // SAFETY: we call `init()` right below, before doing anything else.
+///     unsafe(uninit) static FOO_ALLOC_COUNT: Mutex<usize> = 0;
+/// }
+/// // SAFETY: We call `init()` only once, here.
+/// unsafe { FOO_ALLOC_COUNT.init() };
+///
+/// struct Foo;
+///
+/// impl Foo {
+///     fn new() -> Result<Owned<Self>> {
+///         // We are just using a `KBox` here to handle the actual allocation, as our `Foo` is
+///         // not actually a C-allocated object.
+///         let result = KBox::new(
+///             Foo {},
+///             flags::GFP_KERNEL,
+///         )?;
+///         let result = KBox::into_non_null(result);
+///         // Count new allocation
+///         *FOO_ALLOC_COUNT.lock() += 1;
+///         // SAFETY:
+///         //  - We just allocated the `Self`, thus it is valid and we own it.
+///         //  - We can transfer this ownership to the `from_raw` method.
+///         Ok(unsafe { Owned::from_raw(result) })
+///     }
+/// }
+///
+/// impl Ownable for Foo {
+///     unsafe fn release(&mut self) {
+///         // SAFETY: The [`KBox<Self>`] is still alive. We can pass ownership to the [`KBox`], as
+///         // by requirement on calling this function.
+///         drop(unsafe { KBox::from_raw(self) });
+///         // Count released allocation
+///         *FOO_ALLOC_COUNT.lock() -= 1;
+///     }
+/// }
+///
+/// {
+///    let foo = Foo::new()?;
+///    assert!(*FOO_ALLOC_COUNT.lock() == 1);
+/// }
+/// // `foo` is out of scope now, so we expect no live allocations.
+/// assert!(*FOO_ALLOC_COUNT.lock() == 0);
+/// # Ok::<(), Error>(())
+/// ```
+pub trait Ownable {
+    /// Tear down this `Ownable`.
+    ///
+    /// Implementers of `Ownable` can use this function to clean up the use of `Self`. This can
+    /// include freeing the underlying object.
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that the caller has exclusive ownership of `T`, and this ownership can
+    /// be transferred to the `release` method.
+    unsafe fn release(&mut self);
+}
+
+/// A mutable reference to an owned `T`.
+///
+/// The [`Ownable`] is automatically freed or released when an instance of [`Owned`] is
+/// dropped.
+///
+/// # Invariants
+///
+/// - Until `T::release` is called, this `Owned<T>` exclusively owns the underlying `T`.
+/// - The `T` value is pinned.
+pub struct Owned<T: Ownable> {
+    ptr: NonNull<T>,
+}
+
+impl<T: Ownable> Owned<T> {
+    /// Creates a new instance of [`Owned`].
+    ///
+    /// This function takes over ownership of the underlying object.
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that:
+    /// - `ptr` points to a valid instance of `T`.
+    /// - Until `T::release` is called, the returned `Owned<T>` exclusively owns the underlying `T`.
+    #[inline]
+    pub unsafe fn from_raw(ptr: NonNull<T>) -> Self {
+        // INVARIANT: By function safety requirement we satisfy the first invariant of `Self`.
+        // We treat `T` as pinned from now on.
+        Self { ptr }
+    }
+
+    /// Consumes the [`Owned`], returning a raw pointer.
+    ///
+    /// This function does not drop the underlying `T`. When this function returns, ownership of the
+    /// underlying `T` is with the caller.
+    #[inline]
+    pub fn into_raw(me: Self) -> NonNull<T> {
+        ManuallyDrop::new(me).ptr
+    }
+
+    /// Get a pinned mutable reference to the data owned by this `Owned<T>`.
+    #[inline]
+    pub fn as_pin_mut(&mut self) -> Pin<&mut T> {
+        // SAFETY: The type invariants guarantee that the object is valid, and that we can safely
+        // return a mutable reference to it.
+        let unpinned = unsafe { self.ptr.as_mut() };
+
+        // SAFETY: By type invariant `T` is pinned.
+        unsafe { Pin::new_unchecked(unpinned) }
+    }
+}
+
+// SAFETY: It is safe to send an [`Owned<T>`] to another thread when the underlying `T` is [`Send`],
+// because of the ownership invariant. Sending an [`Owned<T>`] is equivalent to sending the `T`.
+unsafe impl<T: Ownable + Send> Send for Owned<T> {}
+
+// SAFETY: It is safe to send [`&Owned<T>`] to another thread when the underlying `T` is [`Sync`],
+// because of the ownership invariant. Sending an [`&Owned<T>`] is equivalent to sending the `&T`.
+unsafe impl<T: Ownable + Sync> Sync for Owned<T> {}
+
+impl<T: Ownable> Deref for Owned<T> {
+    type Target = T;
+
+    #[inline]
+    fn deref(&self) -> &Self::Target {
+        // SAFETY: The type invariants guarantee that the object is valid.
+        unsafe { self.ptr.as_ref() }
+    }
+}
+
+impl<T: Ownable + Unpin> DerefMut for Owned<T> {
+    #[inline]
+    fn deref_mut(&mut self) -> &mut Self::Target {
+        // SAFETY: The type invariants guarantee that the object is valid, and that we can safely
+        // return a mutable reference to it.
+        unsafe { self.ptr.as_mut() }
+    }
+}
+
+impl<T: Ownable> Drop for Owned<T> {
+    #[inline]
+    fn drop(&mut self) {
+        // SAFETY: By existence of `&mut self` we exclusively own `self` and the underlying `T`. As
+        // we are dropping `self`, we can transfer ownership of the `T` to the `release` method.
+        unsafe { T::release(self.ptr.as_mut()) };
+    }
+}
diff --git a/rust/kernel/sync/aref.rs b/rust/kernel/sync/aref.rs
index 9989f56d0605..4ee5fac0e0b6 100644
--- a/rust/kernel/sync/aref.rs
+++ b/rust/kernel/sync/aref.rs
@@ -29,6 +29,11 @@
 /// Rust code, the recommendation is to use [`Arc`](crate::sync::Arc) to create reference-counted
 /// instances of a type.
 ///
+/// Note: Implementing this trait allows types to be wrapped in an [`ARef<Self>`]. It requires an
+/// internal reference count and provides only shared references. If unique references are required
+/// [`Ownable`](crate::types::Ownable) should be implemented which allows types to be wrapped in an
+/// [`Owned<Self>`](crate::types::Owned).
+///
 /// # Safety
 ///
 /// Implementers must ensure that increments to the reference count keep the object alive in memory
diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
index 4329d3c2c2e5..4aec7b699269 100644
--- a/rust/kernel/types.rs
+++ b/rust/kernel/types.rs
@@ -11,6 +11,17 @@
 };
 use pin_init::{PinInit, Wrapper, Zeroable};
 
+pub use crate::{
+    owned::{
+        Ownable,
+        Owned, //
+    },
+    sync::aref::{
+        ARef,
+        AlwaysRefCounted, //
+    }, //
+};
+
 /// Used to transfer ownership to and from foreign (non-Rust) languages.
 ///
 /// Ownership is transferred from Rust to a foreign language by calling [`Self::into_foreign`] and

-- 
2.51.2



^ permalink raw reply related
