Linux Security Modules development


* [PATCH 0/3] Add support for ML-DSA signature for EVM and IMA
From: Stefan Berger @ 2026-04-05 23:12 UTC
  To: linux-integrity, linux-security-module
  Cc: linux-kernel, zohar, roberto.sassu, ebiggers, Stefan Berger

Building on the IMA sigv3 signature type, add support for ML-DSA signatures
for EVM and IMA. Use the existing ML-DSA hashless signing mode (pure mode).

   Stefan

Stefan Berger (3):
  crypto: public_key: Remove check for valid hash_algo for ML-DSA keys
  integrity: Refactor asymmetric_verify for reusability
  integrity: Add support for sigv3 verification using ML-DSA keys

 crypto/asymmetric_keys/public_key.c    |   5 -
 security/integrity/digsig_asymmetric.c | 126 +++++++++++++++++++++----
 2 files changed, 107 insertions(+), 24 deletions(-)


base-commit: 82bbd447199ff1441031d2eaf9afe041550cf525
-- 
2.53.0



* Re: [PATCH v4 2/3] lsm: add backing_file LSM hooks
From: Serge E. Hallyn @ 2026-04-05  3:12 UTC
  To: Paul Moore
  Cc: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs, Amir Goldstein, Gao Xiang, Christian Brauner
In-Reply-To: <20260403030848.731867-7-paul@paul-moore.com>

On Thu, Apr 02, 2026 at 11:08:34PM -0400, Paul Moore wrote:
> Stacked filesystems such as overlayfs do not currently provide the
> necessary mechanisms for LSMs to properly enforce access controls on the
> mmap() and mprotect() operations.  In order to resolve this gap, a LSM
> security blob is being added to the backing_file struct and the following
> new LSM hooks are being created:
> 
>  security_backing_file_alloc()
>  security_backing_file_free()
>  security_mmap_backing_file()
> 
> The first two hooks are to manage the lifecycle of the LSM security blob
> in the backing_file struct, while the third provides a new mmap() access
> control point for the underlying backing file.  It is also expected that
> LSMs will likely want to update their security_file_mprotect() callback
> to address issues with their mprotect() controls, but that does not
> require a change to the security_file_mprotect() LSM hook.
> 
> There are three other small changes to support these new LSM hooks:
> * Pass the user file associated with a backing file down to
>   alloc_empty_backing_file() so it can be included in the
>   security_backing_file_alloc() hook.
> * Add getter and setter functions for the backing_file struct LSM blob
>   as the backing_file struct remains private to fs/file_table.c.
> * Constify the file struct field in the LSM common_audit_data struct to
>   better support LSMs that need to pass a const file struct pointer into
>   the common LSM audit code.
> 
> Thanks to Arnd Bergmann for identifying the missing EXPORT_SYMBOL_GPL()
> and supplying a fixup.
> 
> Cc: stable@vger.kernel.org
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-unionfs@vger.kernel.org
> Cc: linux-erofs@lists.ozlabs.org
> Signed-off-by: Paul Moore <paul@paul-moore.com>

Reviewed-by: Serge Hallyn <serge@hallyn.com>

> ---
>  fs/backing-file.c             |  18 ++++--
>  fs/erofs/ishare.c             |  10 +++-
>  fs/file_table.c               |  27 +++++++--
>  fs/fuse/passthrough.c         |   2 +-
>  fs/internal.h                 |   3 +-
>  fs/overlayfs/dir.c            |   2 +-
>  fs/overlayfs/file.c           |   2 +-
>  include/linux/backing-file.h  |   4 +-
>  include/linux/fs.h            |  13 +++++
>  include/linux/lsm_audit.h     |   2 +-
>  include/linux/lsm_hook_defs.h |   5 ++
>  include/linux/lsm_hooks.h     |   1 +
>  include/linux/security.h      |  22 ++++++++
>  security/lsm.h                |   1 +
>  security/lsm_init.c           |   9 +++
>  security/security.c           | 102 ++++++++++++++++++++++++++++++++++
>  16 files changed, 206 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/backing-file.c b/fs/backing-file.c
> index 45da8600d564..1f3bbfc75882 100644
> --- a/fs/backing-file.c
> +++ b/fs/backing-file.c
> @@ -12,6 +12,7 @@
>  #include <linux/backing-file.h>
>  #include <linux/splice.h>
>  #include <linux/mm.h>
> +#include <linux/security.h>
>  
>  #include "internal.h"
>  
> @@ -29,14 +30,15 @@
>   * returned file into a container structure that also stores the stacked
>   * file's path, which can be retrieved using backing_file_user_path().
>   */
> -struct file *backing_file_open(const struct path *user_path, int flags,
> +struct file *backing_file_open(const struct file *user_file, int flags,
>  			       const struct path *real_path,
>  			       const struct cred *cred)
>  {
> +	const struct path *user_path = &user_file->f_path;
>  	struct file *f;
>  	int error;
>  
> -	f = alloc_empty_backing_file(flags, cred);
> +	f = alloc_empty_backing_file(flags, cred, user_file);
>  	if (IS_ERR(f))
>  		return f;
>  
> @@ -52,15 +54,16 @@ struct file *backing_file_open(const struct path *user_path, int flags,
>  }
>  EXPORT_SYMBOL_GPL(backing_file_open);
>  
> -struct file *backing_tmpfile_open(const struct path *user_path, int flags,
> +struct file *backing_tmpfile_open(const struct file *user_file, int flags,
>  				  const struct path *real_parentpath,
>  				  umode_t mode, const struct cred *cred)
>  {
>  	struct mnt_idmap *real_idmap = mnt_idmap(real_parentpath->mnt);
> +	const struct path *user_path = &user_file->f_path;
>  	struct file *f;
>  	int error;
>  
> -	f = alloc_empty_backing_file(flags, cred);
> +	f = alloc_empty_backing_file(flags, cred, user_file);
>  	if (IS_ERR(f))
>  		return f;
>  
> @@ -336,8 +339,13 @@ int backing_file_mmap(struct file *file, struct vm_area_struct *vma,
>  
>  	vma_set_file(vma, file);
>  
> -	scoped_with_creds(ctx->cred)
> +	scoped_with_creds(ctx->cred) {
> +		ret = security_mmap_backing_file(vma, file, user_file);
> +		if (ret)
> +			return ret;
> +
>  		ret = vfs_mmap(vma->vm_file, vma);
> +	}
>  
>  	if (ctx->accessed)
>  		ctx->accessed(user_file);
> diff --git a/fs/erofs/ishare.c b/fs/erofs/ishare.c
> index ec433bacc592..6ed66b17359b 100644
> --- a/fs/erofs/ishare.c
> +++ b/fs/erofs/ishare.c
> @@ -4,6 +4,7 @@
>   */
>  #include <linux/xxhash.h>
>  #include <linux/mount.h>
> +#include <linux/security.h>
>  #include "internal.h"
>  #include "xattr.h"
>  
> @@ -106,7 +107,8 @@ static int erofs_ishare_file_open(struct inode *inode, struct file *file)
>  
>  	if (file->f_flags & O_DIRECT)
>  		return -EINVAL;
> -	realfile = alloc_empty_backing_file(O_RDONLY|O_NOATIME, current_cred());
> +	realfile = alloc_empty_backing_file(O_RDONLY|O_NOATIME, current_cred(),
> +					    file);
>  	if (IS_ERR(realfile))
>  		return PTR_ERR(realfile);
>  	ihold(sharedinode);
> @@ -150,8 +152,14 @@ static ssize_t erofs_ishare_file_read_iter(struct kiocb *iocb,
>  static int erofs_ishare_mmap(struct file *file, struct vm_area_struct *vma)
>  {
>  	struct file *realfile = file->private_data;
> +	int err;
>  
>  	vma_set_file(vma, realfile);
> +
> +	err = security_mmap_backing_file(vma, realfile, file);
> +	if (err)
> +		return err;
> +
>  	return generic_file_readonly_mmap(file, vma);
>  }
>  
> diff --git a/fs/file_table.c b/fs/file_table.c
> index 3b3792903185..d19d879b6efc 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -50,6 +50,9 @@ struct backing_file {
>  		struct path user_path;
>  		freeptr_t bf_freeptr;
>  	};
> +#ifdef CONFIG_SECURITY
> +	void *security;
> +#endif
>  };
>  
>  #define backing_file(f) container_of(f, struct backing_file, file)
> @@ -66,8 +69,21 @@ void backing_file_set_user_path(struct file *f, const struct path *path)
>  }
>  EXPORT_SYMBOL_GPL(backing_file_set_user_path);
>  
> +#ifdef CONFIG_SECURITY
> +void *backing_file_security(const struct file *f)
> +{
> +	return backing_file(f)->security;
> +}
> +
> +void backing_file_set_security(struct file *f, void *security)
> +{
> +	backing_file(f)->security = security;
> +}
> +#endif /* CONFIG_SECURITY */
> +
>  static inline void backing_file_free(struct backing_file *ff)
>  {
> +	security_backing_file_free(&ff->file);
>  	path_put(&ff->user_path);
>  	kmem_cache_free(bfilp_cachep, ff);
>  }
> @@ -288,10 +304,12 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
>  	return f;
>  }
>  
> -static int init_backing_file(struct backing_file *ff)
> +static int init_backing_file(struct backing_file *ff,
> +			     const struct file *user_file)
>  {
>  	memset(&ff->user_path, 0, sizeof(ff->user_path));
> -	return 0;
> +	backing_file_set_security(&ff->file, NULL);
> +	return security_backing_file_alloc(&ff->file, user_file);
>  }
>  
>  /*
> @@ -301,7 +319,8 @@ static int init_backing_file(struct backing_file *ff)
>   * This is only for kernel internal use, and the allocate file must not be
>   * installed into file tables or such.
>   */
> -struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
> +struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
> +				      const struct file *user_file)
>  {
>  	struct backing_file *ff;
>  	int error;
> @@ -318,7 +337,7 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
>  
>  	/* The f_mode flags must be set before fput(). */
>  	ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
> -	error = init_backing_file(ff);
> +	error = init_backing_file(ff, user_file);
>  	if (unlikely(error)) {
>  		fput(&ff->file);
>  		return ERR_PTR(error);
> diff --git a/fs/fuse/passthrough.c b/fs/fuse/passthrough.c
> index 72de97c03d0e..f2d08ac2459b 100644
> --- a/fs/fuse/passthrough.c
> +++ b/fs/fuse/passthrough.c
> @@ -167,7 +167,7 @@ struct fuse_backing *fuse_passthrough_open(struct file *file, int backing_id)
>  		goto out;
>  
>  	/* Allocate backing file per fuse file to store fuse path */
> -	backing_file = backing_file_open(&file->f_path, file->f_flags,
> +	backing_file = backing_file_open(file, file->f_flags,
>  					 &fb->file->f_path, fb->cred);
>  	err = PTR_ERR(backing_file);
>  	if (IS_ERR(backing_file)) {
> diff --git a/fs/internal.h b/fs/internal.h
> index cbc384a1aa09..77e90e4124e0 100644
> --- a/fs/internal.h
> +++ b/fs/internal.h
> @@ -106,7 +106,8 @@ extern void chroot_fs_refs(const struct path *, const struct path *);
>   */
>  struct file *alloc_empty_file(int flags, const struct cred *cred);
>  struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred);
> -struct file *alloc_empty_backing_file(int flags, const struct cred *cred);
> +struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
> +				      const struct file *user_file);
>  void backing_file_set_user_path(struct file *f, const struct path *path);
>  
>  static inline void file_put_write_access(struct file *file)
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index ff3dbd1ca61f..f2f20a611af3 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -1374,7 +1374,7 @@ static int ovl_create_tmpfile(struct file *file, struct dentry *dentry,
>  				return PTR_ERR(cred);
>  
>  			ovl_path_upper(dentry->d_parent, &realparentpath);
> -			realfile = backing_tmpfile_open(&file->f_path, flags, &realparentpath,
> +			realfile = backing_tmpfile_open(file, flags, &realparentpath,
>  							mode, current_cred());
>  			err = PTR_ERR_OR_ZERO(realfile);
>  			pr_debug("tmpfile/open(%pd2, 0%o) = %i\n", realparentpath.dentry, mode, err);
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index 97bed2286030..27cc07738f33 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -48,7 +48,7 @@ static struct file *ovl_open_realfile(const struct file *file,
>  			if (!inode_owner_or_capable(real_idmap, realinode))
>  				flags &= ~O_NOATIME;
>  
> -			realfile = backing_file_open(file_user_path(file),
> +			realfile = backing_file_open(file,
>  						     flags, realpath, current_cred());
>  		}
>  	}
> diff --git a/include/linux/backing-file.h b/include/linux/backing-file.h
> index 1476a6ed1bfd..c939cd222730 100644
> --- a/include/linux/backing-file.h
> +++ b/include/linux/backing-file.h
> @@ -18,10 +18,10 @@ struct backing_file_ctx {
>  	void (*end_write)(struct kiocb *iocb, ssize_t);
>  };
>  
> -struct file *backing_file_open(const struct path *user_path, int flags,
> +struct file *backing_file_open(const struct file *user_file, int flags,
>  			       const struct path *real_path,
>  			       const struct cred *cred);
> -struct file *backing_tmpfile_open(const struct path *user_path, int flags,
> +struct file *backing_tmpfile_open(const struct file *user_file, int flags,
>  				  const struct path *real_parentpath,
>  				  umode_t mode, const struct cred *cred);
>  ssize_t backing_file_read_iter(struct file *file, struct iov_iter *iter,
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 8b3dd145b25e..d0d0e8f55589 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2475,6 +2475,19 @@ struct file *dentry_create(struct path *path, int flags, umode_t mode,
>  			   const struct cred *cred);
>  const struct path *backing_file_user_path(const struct file *f);
>  
> +#ifdef CONFIG_SECURITY
> +void *backing_file_security(const struct file *f);
> +void backing_file_set_security(struct file *f, void *security);
> +#else
> +static inline void *backing_file_security(const struct file *f)
> +{
> +	return NULL;
> +}
> +static inline void backing_file_set_security(struct file *f, void *security)
> +{
> +}
> +#endif /* CONFIG_SECURITY */
> +
>  /*
>   * When mmapping a file on a stackable filesystem (e.g., overlayfs), the file
>   * stored in ->vm_file is a backing file whose f_inode is on the underlying
> diff --git a/include/linux/lsm_audit.h b/include/linux/lsm_audit.h
> index 382c56a97bba..584db296e43b 100644
> --- a/include/linux/lsm_audit.h
> +++ b/include/linux/lsm_audit.h
> @@ -94,7 +94,7 @@ struct common_audit_data {
>  #endif
>  		char *kmod_name;
>  		struct lsm_ioctlop_audit *op;
> -		struct file *file;
> +		const struct file *file;
>  		struct lsm_ibpkey_audit *ibpkey;
>  		struct lsm_ibendport_audit *ibendport;
>  		int reason;
> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> index 8c42b4bde09c..b4958167e381 100644
> --- a/include/linux/lsm_hook_defs.h
> +++ b/include/linux/lsm_hook_defs.h
> @@ -191,6 +191,9 @@ LSM_HOOK(int, 0, file_permission, struct file *file, int mask)
>  LSM_HOOK(int, 0, file_alloc_security, struct file *file)
>  LSM_HOOK(void, LSM_RET_VOID, file_release, struct file *file)
>  LSM_HOOK(void, LSM_RET_VOID, file_free_security, struct file *file)
> +LSM_HOOK(int, 0, backing_file_alloc, struct file *backing_file,
> +	 const struct file *user_file)
> +LSM_HOOK(void, LSM_RET_VOID, backing_file_free, struct file *backing_file)
>  LSM_HOOK(int, 0, file_ioctl, struct file *file, unsigned int cmd,
>  	 unsigned long arg)
>  LSM_HOOK(int, 0, file_ioctl_compat, struct file *file, unsigned int cmd,
> @@ -198,6 +201,8 @@ LSM_HOOK(int, 0, file_ioctl_compat, struct file *file, unsigned int cmd,
>  LSM_HOOK(int, 0, mmap_addr, unsigned long addr)
>  LSM_HOOK(int, 0, mmap_file, struct file *file, unsigned long reqprot,
>  	 unsigned long prot, unsigned long flags)
> +LSM_HOOK(int, 0, mmap_backing_file, struct vm_area_struct *vma,
> +	 struct file *backing_file, struct file *user_file)
>  LSM_HOOK(int, 0, file_mprotect, struct vm_area_struct *vma,
>  	 unsigned long reqprot, unsigned long prot)
>  LSM_HOOK(int, 0, file_lock, struct file *file, unsigned int cmd)
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index d48bf0ad26f4..b4f8cad53ddb 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -104,6 +104,7 @@ struct security_hook_list {
>  struct lsm_blob_sizes {
>  	unsigned int lbs_cred;
>  	unsigned int lbs_file;
> +	unsigned int lbs_backing_file;
>  	unsigned int lbs_ib;
>  	unsigned int lbs_inode;
>  	unsigned int lbs_sock;
> diff --git a/include/linux/security.h b/include/linux/security.h
> index ee88dd2d2d1f..8d2d4856934e 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -472,11 +472,17 @@ int security_file_permission(struct file *file, int mask);
>  int security_file_alloc(struct file *file);
>  void security_file_release(struct file *file);
>  void security_file_free(struct file *file);
> +int security_backing_file_alloc(struct file *backing_file,
> +				const struct file *user_file);
> +void security_backing_file_free(struct file *backing_file);
>  int security_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
>  int security_file_ioctl_compat(struct file *file, unsigned int cmd,
>  			       unsigned long arg);
>  int security_mmap_file(struct file *file, unsigned long prot,
>  			unsigned long flags);
> +int security_mmap_backing_file(struct vm_area_struct *vma,
> +			       struct file *backing_file,
> +			       struct file *user_file);
>  int security_mmap_addr(unsigned long addr);
>  int security_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
>  			   unsigned long prot);
> @@ -1141,6 +1147,15 @@ static inline void security_file_release(struct file *file)
>  static inline void security_file_free(struct file *file)
>  { }
>  
> +static inline int security_backing_file_alloc(struct file *backing_file,
> +					      const struct file *user_file)
> +{
> +	return 0;
> +}
> +
> +static inline void security_backing_file_free(struct file *backing_file)
> +{ }
> +
>  static inline int security_file_ioctl(struct file *file, unsigned int cmd,
>  				      unsigned long arg)
>  {
> @@ -1160,6 +1175,13 @@ static inline int security_mmap_file(struct file *file, unsigned long prot,
>  	return 0;
>  }
>  
> +static inline int security_mmap_backing_file(struct vm_area_struct *vma,
> +					     struct file *backing_file,
> +					     struct file *user_file)
> +{
> +	return 0;
> +}
> +
>  static inline int security_mmap_addr(unsigned long addr)
>  {
>  	return cap_mmap_addr(addr);
> diff --git a/security/lsm.h b/security/lsm.h
> index db77cc83e158..32f808ad4335 100644
> --- a/security/lsm.h
> +++ b/security/lsm.h
> @@ -29,6 +29,7 @@ extern struct lsm_blob_sizes blob_sizes;
>  
>  /* LSM blob caches */
>  extern struct kmem_cache *lsm_file_cache;
> +extern struct kmem_cache *lsm_backing_file_cache;
>  extern struct kmem_cache *lsm_inode_cache;
>  
>  /* LSM blob allocators */
> diff --git a/security/lsm_init.c b/security/lsm_init.c
> index 573e2a7250c4..7c0fd17f1601 100644
> --- a/security/lsm_init.c
> +++ b/security/lsm_init.c
> @@ -293,6 +293,8 @@ static void __init lsm_prepare(struct lsm_info *lsm)
>  	blobs = lsm->blobs;
>  	lsm_blob_size_update(&blobs->lbs_cred, &blob_sizes.lbs_cred);
>  	lsm_blob_size_update(&blobs->lbs_file, &blob_sizes.lbs_file);
> +	lsm_blob_size_update(&blobs->lbs_backing_file,
> +			     &blob_sizes.lbs_backing_file);
>  	lsm_blob_size_update(&blobs->lbs_ib, &blob_sizes.lbs_ib);
>  	/* inode blob gets an rcu_head in addition to LSM blobs. */
>  	if (blobs->lbs_inode && blob_sizes.lbs_inode == 0)
> @@ -441,6 +443,8 @@ int __init security_init(void)
>  	if (lsm_debug) {
>  		lsm_pr("blob(cred) size %d\n", blob_sizes.lbs_cred);
>  		lsm_pr("blob(file) size %d\n", blob_sizes.lbs_file);
> +		lsm_pr("blob(backing_file) size %d\n",
> +		       blob_sizes.lbs_backing_file);
>  		lsm_pr("blob(ib) size %d\n", blob_sizes.lbs_ib);
>  		lsm_pr("blob(inode) size %d\n", blob_sizes.lbs_inode);
>  		lsm_pr("blob(ipc) size %d\n", blob_sizes.lbs_ipc);
> @@ -462,6 +466,11 @@ int __init security_init(void)
>  		lsm_file_cache = kmem_cache_create("lsm_file_cache",
>  						   blob_sizes.lbs_file, 0,
>  						   SLAB_PANIC, NULL);
> +	if (blob_sizes.lbs_backing_file)
> +		lsm_backing_file_cache = kmem_cache_create(
> +						   "lsm_backing_file_cache",
> +						   blob_sizes.lbs_backing_file,
> +						   0, SLAB_PANIC, NULL);
>  	if (blob_sizes.lbs_inode)
>  		lsm_inode_cache = kmem_cache_create("lsm_inode_cache",
>  						    blob_sizes.lbs_inode, 0,
> diff --git a/security/security.c b/security/security.c
> index a26c1474e2e4..048560ef6a1a 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -82,6 +82,7 @@ const struct lsm_id *lsm_idlist[MAX_LSM_COUNT];
>  struct lsm_blob_sizes blob_sizes;
>  
>  struct kmem_cache *lsm_file_cache;
> +struct kmem_cache *lsm_backing_file_cache;
>  struct kmem_cache *lsm_inode_cache;
>  
>  #define SECURITY_HOOK_ACTIVE_KEY(HOOK, IDX) security_hook_active_##HOOK##_##IDX
> @@ -173,6 +174,30 @@ static int lsm_file_alloc(struct file *file)
>  	return 0;
>  }
>  
> +/**
> + * lsm_backing_file_alloc - allocate a composite backing file blob
> + * @backing_file: the backing file
> + *
> + * Allocate the backing file blob for all the modules.
> + *
> + * Returns 0, or -ENOMEM if memory can't be allocated.
> + */
> +static int lsm_backing_file_alloc(struct file *backing_file)
> +{
> +	void *blob;
> +
> +	if (!lsm_backing_file_cache) {
> +		backing_file_set_security(backing_file, NULL);
> +		return 0;
> +	}
> +
> +	blob = kmem_cache_zalloc(lsm_backing_file_cache, GFP_KERNEL);
> +	backing_file_set_security(backing_file, blob);
> +	if (!blob)
> +		return -ENOMEM;
> +	return 0;
> +}
> +
>  /**
>   * lsm_blob_alloc - allocate a composite blob
>   * @dest: the destination for the blob
> @@ -2418,6 +2443,57 @@ void security_file_free(struct file *file)
>  	}
>  }
>  
> +/**
> + * security_backing_file_alloc() - Allocate and setup a backing file blob
> + * @backing_file: the backing file
> + * @user_file: the associated user visible file
> + *
> + * Allocate a backing file LSM blob and perform any necessary initialization of
> + * the LSM blob.  There will be some operations where the LSM will not have
> + * access to @user_file after this point, so any important state associated
> + * with @user_file that is important to the LSM should be captured in the
> + * backing file's LSM blob.
> + *
> + * LSMs should avoid taking a reference to @user_file in this hook as it will
> + * result in problems later when the system attempts to drop/put the file
> + * references due to a circular dependency.
> + *
> + * Return: Return 0 if the hook is successful, negative values otherwise.
> + */
> +int security_backing_file_alloc(struct file *backing_file,
> +				const struct file *user_file)
> +{
> +	int rc;
> +
> +	rc = lsm_backing_file_alloc(backing_file);
> +	if (rc)
> +		return rc;
> +	rc = call_int_hook(backing_file_alloc, backing_file, user_file);
> +	if (unlikely(rc))
> +		security_backing_file_free(backing_file);
> +
> +	return rc;
> +}
> +
> +/**
> + * security_backing_file_free() - Free a backing file blob
> + * @backing_file: the backing file
> + *
> + * Free any LSM state associated with a backing file's LSM blob, including the
> + * blob itself.
> + */
> +void security_backing_file_free(struct file *backing_file)
> +{
> +	void *blob = backing_file_security(backing_file);
> +
> +	call_void_hook(backing_file_free, backing_file);
> +
> +	if (blob) {
> +		backing_file_set_security(backing_file, NULL);
> +		kmem_cache_free(lsm_backing_file_cache, blob);
> +	}
> +}
> +
>  /**
>   * security_file_ioctl() - Check if an ioctl is allowed
>   * @file: associated file
> @@ -2506,6 +2582,32 @@ int security_mmap_file(struct file *file, unsigned long prot,
>  			     flags);
>  }
>  
> +/**
> + * security_mmap_backing_file - Check if mmap'ing a backing file is allowed
> + * @vma: the vm_area_struct for the mmap'd region
> + * @backing_file: the backing file being mmap'd
> + * @user_file: the user file being mmap'd
> + *
> + * Check permissions for a mmap operation on a stacked filesystem.  This hook
> + * is called after the security_mmap_file() and is responsible for authorizing
> + * the mmap on @backing_file.  It is important to note that the mmap operation
> + * on @user_file has already been authorized and the @vma->vm_file has been
> + * set to @backing_file.
> + *
> + * Return: Returns 0 if permission is granted.
> + */
> +int security_mmap_backing_file(struct vm_area_struct *vma,
> +			       struct file *backing_file,
> +			       struct file *user_file)
> +{
> +	/* recommended by the stackable filesystem devs */
> +	if (WARN_ON_ONCE(!(backing_file->f_mode & FMODE_BACKING)))
> +		return -EIO;
> +
> +	return call_int_hook(mmap_backing_file, vma, backing_file, user_file);
> +}
> +EXPORT_SYMBOL_GPL(security_mmap_backing_file);
> +
>  /**
>   * security_mmap_addr() - Check if mmap'ing an address is allowed
>   * @addr: address
> -- 
> 2.53.0
> 
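The blob lifecycle in security_backing_file_alloc() and
security_backing_file_free() can be modelled in userspace. The following is
a minimal sketch, not kernel code: every model_* name, the integer
user_label standing in for state taken from the user file, and the use of
calloc()/free() in place of the kmem_cache are assumptions made for
illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct backing_file; not the real layout. */
struct model_backing_file {
	void *security;
};

/* Hypothetical per-LSM blob that captures state from the user file. */
struct model_blob {
	int user_label;
};

/* Hypothetical hook type, loosely modelled on backing_file_alloc. */
typedef int (*model_alloc_hook)(struct model_backing_file *bf, int user_label);

/* Free the blob if one is attached; safe to call more than once. */
void model_backing_file_free(struct model_backing_file *bf)
{
	void *blob = bf->security;

	if (blob) {
		bf->security = NULL;
		free(blob);
	}
}

/*
 * Allocate a zeroed blob (when blob_size is non-zero), then run the hook.
 * If the hook fails, the blob is released before the error is returned.
 */
int model_backing_file_alloc(struct model_backing_file *bf, size_t blob_size,
			     model_alloc_hook hook, int user_label)
{
	int rc;

	bf->security = NULL;
	if (blob_size) {
		bf->security = calloc(1, blob_size);
		if (!bf->security)
			return -ENOMEM;
	}

	rc = hook ? hook(bf, user_label) : 0;
	if (rc)
		model_backing_file_free(bf);
	return rc;
}

/* Example hook: copy the label by value instead of keeping a reference. */
int model_hook_capture(struct model_backing_file *bf, int user_label)
{
	struct model_blob *blob = bf->security;

	assert(blob != NULL);
	if (user_label < 0)
		return -EACCES;
	blob->user_label = user_label;
	return 0;
}
```

Copying the label by value in the example hook mirrors the guidance in the
security_backing_file_alloc() comment: capture what is needed from the user
file rather than holding a reference to it.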


* Re: [PATCH v4 1/3] fs: prepare for adding LSM blob to backing_file
From: Serge E. Hallyn @ 2026-04-05  0:14 UTC
  To: Paul Moore
  Cc: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs, Amir Goldstein, Gao Xiang, Christian Brauner
In-Reply-To: <20260403030848.731867-6-paul@paul-moore.com>

On Thu, Apr 02, 2026 at 11:08:33PM -0400, Paul Moore wrote:
> From: Amir Goldstein <amir73il@gmail.com>
> 
> In preparation for adding an LSM blob to the backing_file struct, factor
> out the helpers init_backing_file() and backing_file_free().
> 
> Cc: stable@vger.kernel.org
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-unionfs@vger.kernel.org
> Cc: linux-erofs@lists.ozlabs.org
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> [PM: use the term "LSM blob", fix comment style to match file]
> Signed-off-by: Paul Moore <paul@paul-moore.com>

Reviewed-by: Serge Hallyn <serge@hallyn.com>

> ---
>  fs/file_table.c | 22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/file_table.c b/fs/file_table.c
> index aaa5faaace1e..3b3792903185 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -66,6 +66,12 @@ void backing_file_set_user_path(struct file *f, const struct path *path)
>  }
>  EXPORT_SYMBOL_GPL(backing_file_set_user_path);
>  
> +static inline void backing_file_free(struct backing_file *ff)
> +{
> +	path_put(&ff->user_path);
> +	kmem_cache_free(bfilp_cachep, ff);
> +}
> +
>  static inline void file_free(struct file *f)
>  {
>  	security_file_free(f);
> @@ -73,8 +79,7 @@ static inline void file_free(struct file *f)
>  		percpu_counter_dec(&nr_files);
>  	put_cred(f->f_cred);
>  	if (unlikely(f->f_mode & FMODE_BACKING)) {
> -		path_put(backing_file_user_path(f));
> -		kmem_cache_free(bfilp_cachep, backing_file(f));
> +		backing_file_free(backing_file(f));
>  	} else {
>  		kmem_cache_free(filp_cachep, f);
>  	}
> @@ -283,6 +288,12 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
>  	return f;
>  }
>  
> +static int init_backing_file(struct backing_file *ff)
> +{
> +	memset(&ff->user_path, 0, sizeof(ff->user_path));
> +	return 0;
> +}
> +
>  /*
>   * Variant of alloc_empty_file() that allocates a backing_file container
>   * and doesn't check and modify nr_files.
> @@ -305,7 +316,14 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
>  		return ERR_PTR(error);
>  	}
>  
> +	/* The f_mode flags must be set before fput(). */
>  	ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
> +	error = init_backing_file(ff);
> +	if (unlikely(error)) {
> +		fput(&ff->file);
> +		return ERR_PTR(error);
> +	}
> +
>  	return &ff->file;
>  }
>  EXPORT_SYMBOL_GPL(alloc_empty_backing_file);
> -- 
> 2.53.0
> 


* [PATCH v1 2/2] landlock: Allow TSYNC with LOG_SUBDOMAINS_OFF and fd=-1
From: Mickaël Salaün @ 2026-04-04  8:49 UTC
  To: Günther Noack
  Cc: Mickaël Salaün, linux-security-module, stable
In-Reply-To: <20260404085001.1604405-1-mic@digikod.net>

LANDLOCK_RESTRICT_SELF_TSYNC does not allow
LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF with ruleset_fd=-1, preventing
a multithreaded process from atomically propagating subdomain log muting
to all threads without creating a domain layer.  Relax the fd=-1
condition to accept TSYNC alongside LOG_SUBDOMAINS_OFF, and update the
documentation accordingly.

Add flag validation tests for all TSYNC combinations with ruleset_fd=-1,
and audit tests verifying both transition directions: muting via TSYNC
(logged to not logged) and override via TSYNC (not logged to logged).

Cc: Günther Noack <gnoack@google.com>
Cc: stable@vger.kernel.org
Fixes: 42fc7e6543f6 ("landlock: Multithreading support for landlock_restrict_self()")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
---
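
The relaxed fd=-1 condition can be tried outside the kernel. This is a
minimal sketch under stated assumptions: the helper name
ruleset_may_be_omitted() is hypothetical, and the SKETCH_* bit values are
placeholders chosen for the example. The authoritative flag definitions are
in include/uapi/linux/landlock.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for this sketch only; not taken from the UAPI header. */
#define SKETCH_LOG_SAME_EXEC_OFF	(1U << 0)
#define SKETCH_LOG_SUBDOMAINS_OFF	(1U << 2)
#define SKETCH_TSYNC			(1U << 3)

/* Masking out TSYNC only works if the two flags use different bits. */
static_assert((SKETCH_TSYNC & SKETCH_LOG_SUBDOMAINS_OFF) == 0,
	      "flags must not overlap");

/*
 * Returns true when no ruleset is needed: ruleset_fd is -1 and, ignoring
 * the optional TSYNC bit, LOG_SUBDOMAINS_OFF is the only flag set.
 */
bool ruleset_may_be_omitted(int ruleset_fd, uint32_t flags)
{
	return ruleset_fd == -1 &&
	       (flags & ~SKETCH_TSYNC) == SKETCH_LOG_SUBDOMAINS_OFF;
}
```

Because TSYNC is masked out before the comparison, TSYNC on its own does
not satisfy the check: a call with fd=-1 and only TSYNC set still takes the
ruleset lookup path, as does any combination with a third flag.
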
 include/uapi/linux/landlock.h                 |   4 +-
 security/landlock/syscalls.c                  |  14 +-
 tools/testing/selftests/landlock/audit_test.c | 233 ++++++++++++++++++
 tools/testing/selftests/landlock/tsync_test.c |  74 ++++++
 4 files changed, 319 insertions(+), 6 deletions(-)

diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
index f88fa1f68b77..d37603efc273 100644
--- a/include/uapi/linux/landlock.h
+++ b/include/uapi/linux/landlock.h
@@ -116,7 +116,9 @@ struct landlock_ruleset_attr {
  *     ``LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF``, this flag only affects
  *     future nested domains, not the one being created. It can also be used
  *     with a @ruleset_fd value of -1 to mute subdomain logs without creating a
- *     domain.
+ *     domain.  When combined with %LANDLOCK_RESTRICT_SELF_TSYNC and a
+ *     @ruleset_fd value of -1, this configuration is propagated to all threads
+ *     of the current process.
  *
  * The following flag supports policy enforcement in multithreaded processes:
  *
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index 0d66a68677b7..a0bb664e0d31 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -512,10 +512,13 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
 
 	/*
 	 * It is allowed to set LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF with
-	 * -1 as ruleset_fd, but no other flag must be set.
+	 * -1 as ruleset_fd, optionally combined with
+	 * LANDLOCK_RESTRICT_SELF_TSYNC to propagate this configuration to all
+	 * threads.  No other flag must be set.
 	 */
 	if (!(ruleset_fd == -1 &&
-	      flags == LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF)) {
+	      (flags & ~LANDLOCK_RESTRICT_SELF_TSYNC) ==
+		      LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF)) {
 		/* Gets and checks the ruleset. */
 		ruleset = get_ruleset_from_fd(ruleset_fd, FMODE_CAN_READ);
 		if (IS_ERR(ruleset))
@@ -537,9 +540,10 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
 
 	/*
 	 * The only case when a ruleset may not be set is if
-	 * LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF is set and ruleset_fd is -1.
-	 * We could optimize this case by not calling commit_creds() if this flag
-	 * was already set, but it is not worth the complexity.
+	 * LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF is set (optionally with
+	 * LANDLOCK_RESTRICT_SELF_TSYNC) and ruleset_fd is -1.  We could
+	 * optimize this case by not calling commit_creds() if this flag was
+	 * already set, but it is not worth the complexity.
 	 */
 	if (ruleset) {
 		/*
diff --git a/tools/testing/selftests/landlock/audit_test.c b/tools/testing/selftests/landlock/audit_test.c
index 20099b8667e7..a193d8a97560 100644
--- a/tools/testing/selftests/landlock/audit_test.c
+++ b/tools/testing/selftests/landlock/audit_test.c
@@ -162,6 +162,7 @@ TEST_F(audit, layers)
 struct thread_data {
 	pid_t parent_pid;
 	int ruleset_fd, pipe_child, pipe_parent;
+	bool mute_subdomains;
 };
 
 static void *thread_audit_test(void *arg)
@@ -367,6 +368,238 @@ TEST_F(audit, log_subdomains_off_fork)
 	EXPECT_EQ(0, close(ruleset_fd));
 }
 
+/*
+ * Thread function: runs two rounds of (create domain, trigger denial, signal
+ * back), waiting for the main thread before each round.  When mute_subdomains
+ * is set, phase 1 also mutes subdomain logs via the fd=-1 path before creating
+ * the domain.  The ruleset_fd is kept open across both rounds so each
+ * restrict_self call stacks a new domain layer.
+ */
+static void *thread_sandbox_deny_twice(void *arg)
+{
+	const struct thread_data *data = (struct thread_data *)arg;
+	uintptr_t err = 0;
+	char buffer;
+
+	/* Phase 1: optionally mutes, creates a domain, and triggers a denial. */
+	if (read(data->pipe_parent, &buffer, 1) != 1) {
+		err = 1;
+		goto out;
+	}
+
+	if (data->mute_subdomains &&
+	    landlock_restrict_self(-1,
+				   LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF)) {
+		err = 2;
+		goto out;
+	}
+
+	if (landlock_restrict_self(data->ruleset_fd, 0)) {
+		err = 3;
+		goto out;
+	}
+
+	if (kill(data->parent_pid, 0) != -1 || errno != EPERM) {
+		err = 4;
+		goto out;
+	}
+
+	if (write(data->pipe_child, ".", 1) != 1) {
+		err = 5;
+		goto out;
+	}
+
+	/* Phase 2: stacks another domain and triggers a denial. */
+	if (read(data->pipe_parent, &buffer, 1) != 1) {
+		err = 6;
+		goto out;
+	}
+
+	if (landlock_restrict_self(data->ruleset_fd, 0)) {
+		err = 7;
+		goto out;
+	}
+
+	if (kill(data->parent_pid, 0) != -1 || errno != EPERM) {
+		err = 8;
+		goto out;
+	}
+
+	if (write(data->pipe_child, ".", 1) != 1) {
+		err = 9;
+		goto out;
+	}
+
+out:
+	close(data->ruleset_fd);
+	close(data->pipe_child);
+	close(data->pipe_parent);
+	return (void *)err;
+}
+
+/*
+ * Verifies that LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF with
+ * LANDLOCK_RESTRICT_SELF_TSYNC and ruleset_fd=-1 propagates log_subdomains_off
+ * to a sibling thread, suppressing audit logging on domains it subsequently
+ * creates.
+ *
+ * Phase 1 (before TSYNC) acts as an inline baseline: the sibling creates a
+ * domain and triggers a denial that IS logged.
+ *
+ * Phase 2 (after TSYNC) verifies suppression: the sibling stacks another domain
+ * and triggers a denial that is NOT logged.
+ */
+TEST_F(audit, log_subdomains_off_tsync)
+{
+	const struct landlock_ruleset_attr ruleset_attr = {
+		.scoped = LANDLOCK_SCOPE_SIGNAL,
+	};
+	struct audit_records records;
+	struct thread_data child_data = {};
+	int pipe_child[2], pipe_parent[2];
+	char buffer;
+	pthread_t thread;
+	void *thread_ret;
+
+	child_data.parent_pid = getppid();
+	ASSERT_EQ(0, pipe2(pipe_child, O_CLOEXEC));
+	child_data.pipe_child = pipe_child[1];
+	ASSERT_EQ(0, pipe2(pipe_parent, O_CLOEXEC));
+	child_data.pipe_parent = pipe_parent[0];
+	child_data.ruleset_fd =
+		landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+	ASSERT_LE(0, child_data.ruleset_fd);
+
+	ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0));
+
+	/* Creates the sibling thread. */
+	ASSERT_EQ(0, pthread_create(&thread, NULL, thread_sandbox_deny_twice,
+				    &child_data));
+
+	/*
+	 * Phase 1: the sibling creates a domain and triggers a denial before
+	 * any log muting.  This proves the audit path works.
+	 */
+	ASSERT_EQ(1, write(pipe_parent[1], ".", 1));
+	ASSERT_EQ(1, read(pipe_child[0], &buffer, 1));
+
+	/* The denial must be logged. */
+	EXPECT_EQ(0, matches_log_signal(_metadata, self->audit_fd,
+					child_data.parent_pid, NULL));
+
+	/* Drains any remaining records (e.g. domain allocation). */
+	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
+
+	/*
+	 * Mutes subdomain logs and propagates to the sibling thread via TSYNC,
+	 * without creating a domain.
+	 */
+	ASSERT_EQ(0, landlock_restrict_self(
+			     -1, LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF |
+					 LANDLOCK_RESTRICT_SELF_TSYNC));
+
+	/*
+	 * Phase 2: the sibling stacks another domain and triggers a denial.
+	 * Because log_subdomains_off was propagated via TSYNC, the new domain
+	 * has log_status=LANDLOCK_LOG_DISABLED.
+	 */
+	ASSERT_EQ(1, write(pipe_parent[1], ".", 1));
+	ASSERT_EQ(1, read(pipe_child[0], &buffer, 1));
+
+	/* No denial record should appear. */
+	EXPECT_EQ(-EAGAIN, matches_log_signal(_metadata, self->audit_fd,
+					      child_data.parent_pid, NULL));
+
+	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
+	EXPECT_EQ(0, records.access);
+
+	EXPECT_EQ(0, close(pipe_child[0]));
+	EXPECT_EQ(0, close(pipe_parent[1]));
+	ASSERT_EQ(0, pthread_join(thread, &thread_ret));
+	EXPECT_EQ(NULL, thread_ret);
+}
+
+/*
+ * Verifies that LANDLOCK_RESTRICT_SELF_TSYNC without
+ * LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF overrides a sibling thread's
+ * log_subdomains_off, re-enabling audit logging on domains the sibling
+ * subsequently creates.
+ *
+ * Phase 1: the sibling sets log_subdomains_off, creates a muted domain, and
+ * triggers a denial that is NOT logged.
+ *
+ * Phase 2 (after TSYNC without LOG_SUBDOMAINS_OFF): the sibling stacks another
+ * domain and triggers a denial that IS logged, proving the muting was
+ * overridden.
+ */
+TEST_F(audit, tsync_override_log_subdomains_off)
+{
+	const struct landlock_ruleset_attr ruleset_attr = {
+		.scoped = LANDLOCK_SCOPE_SIGNAL,
+	};
+	struct audit_records records;
+	struct thread_data child_data;
+	int pipe_child[2], pipe_parent[2];
+	char buffer;
+	pthread_t thread;
+	void *thread_ret;
+
+	child_data.parent_pid = getppid();
+	ASSERT_EQ(0, pipe2(pipe_child, O_CLOEXEC));
+	child_data.pipe_child = pipe_child[1];
+	ASSERT_EQ(0, pipe2(pipe_parent, O_CLOEXEC));
+	child_data.pipe_parent = pipe_parent[0];
+	child_data.ruleset_fd =
+		landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+	ASSERT_LE(0, child_data.ruleset_fd);
+
+	ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0));
+
+	child_data.mute_subdomains = true;
+
+	/* Creates the sibling thread. */
+	ASSERT_EQ(0, pthread_create(&thread, NULL, thread_sandbox_deny_twice,
+				    &child_data));
+
+	/*
+	 * Phase 1: the sibling mutes subdomain logs, creates a domain, and
+	 * triggers a denial.  The denial must not be logged.
+	 */
+	ASSERT_EQ(1, write(pipe_parent[1], ".", 1));
+	ASSERT_EQ(1, read(pipe_child[0], &buffer, 1));
+
+	EXPECT_EQ(-EAGAIN, matches_log_signal(_metadata, self->audit_fd,
+					      child_data.parent_pid, NULL));
+
+	/* Drains any remaining records. */
+	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
+	EXPECT_EQ(0, records.access);
+
+	/*
+	 * Overrides the sibling's log_subdomains_off by calling TSYNC without
+	 * LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF.
+	 */
+	ASSERT_EQ(0, landlock_restrict_self(child_data.ruleset_fd,
+					    LANDLOCK_RESTRICT_SELF_TSYNC));
+
+	/*
+	 * Phase 2: the sibling stacks another domain and triggers a denial.
+	 * Because TSYNC replaced its log_subdomains_off with 0, the new domain
+	 * has log_status=LANDLOCK_LOG_PENDING.
+	 */
+	ASSERT_EQ(1, write(pipe_parent[1], ".", 1));
+	ASSERT_EQ(1, read(pipe_child[0], &buffer, 1));
+
+	/* The denial must be logged. */
+	EXPECT_EQ(0, matches_log_signal(_metadata, self->audit_fd,
+					child_data.parent_pid, NULL));
+
+	EXPECT_EQ(0, close(pipe_child[0]));
+	EXPECT_EQ(0, close(pipe_parent[1]));
+	ASSERT_EQ(0, pthread_join(thread, &thread_ret));
+	EXPECT_EQ(NULL, thread_ret);
+}
+
 FIXTURE(audit_flags)
 {
 	struct audit_filter audit_filter;
diff --git a/tools/testing/selftests/landlock/tsync_test.c b/tools/testing/selftests/landlock/tsync_test.c
index 2b9ad4f154f4..abc290271a1a 100644
--- a/tools/testing/selftests/landlock/tsync_test.c
+++ b/tools/testing/selftests/landlock/tsync_test.c
@@ -247,4 +247,78 @@ TEST(tsync_interrupt)
 	EXPECT_EQ(0, close(ruleset_fd));
 }
 
+/* clang-format off */
+FIXTURE(tsync_without_ruleset) {};
+/* clang-format on */
+
+FIXTURE_VARIANT(tsync_without_ruleset)
+{
+	const __u32 flags;
+	const int expected_errno;
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(tsync_without_ruleset, tsync_only) {
+	/* clang-format on */
+	.flags = LANDLOCK_RESTRICT_SELF_TSYNC,
+	.expected_errno = EBADF,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(tsync_without_ruleset, subdomains_off_same_exec_off) {
+	/* clang-format on */
+	.flags = LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF |
+		 LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF |
+		 LANDLOCK_RESTRICT_SELF_TSYNC,
+	.expected_errno = EBADF,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(tsync_without_ruleset, subdomains_off_new_exec_on) {
+	/* clang-format on */
+	.flags = LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF |
+		 LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON |
+		 LANDLOCK_RESTRICT_SELF_TSYNC,
+	.expected_errno = EBADF,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(tsync_without_ruleset, all_flags) {
+	/* clang-format on */
+	.flags = LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF |
+		 LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON |
+		 LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF |
+		 LANDLOCK_RESTRICT_SELF_TSYNC,
+	.expected_errno = EBADF,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(tsync_without_ruleset, subdomains_off) {
+	/* clang-format on */
+	.flags = LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF |
+		 LANDLOCK_RESTRICT_SELF_TSYNC,
+	.expected_errno = 0,
+};
+
+FIXTURE_SETUP(tsync_without_ruleset)
+{
+}
+
+FIXTURE_TEARDOWN(tsync_without_ruleset)
+{
+}
+
+TEST_F(tsync_without_ruleset, check)
+{
+	int ret;
+
+	ret = landlock_restrict_self(-1, variant->flags);
+	if (variant->expected_errno) {
+		EXPECT_EQ(-1, ret);
+		EXPECT_EQ(variant->expected_errno, errno);
+	} else {
+		EXPECT_EQ(0, ret);
+	}
+}
+
 TEST_HARNESS_MAIN
-- 
2.53.0
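
For illustration only, the acceptance rule exercised by the
tsync_without_ruleset variants can be modeled in plain userspace C.
This is a sketch under assumptions, not the kernel code: the MODEL_*
names and model_check_no_ruleset() are made up for this example, the
bit values are placeholders rather than the UAPI definitions, and the
part of the condition not visible in the hunk is assumed to test
ruleset_fd == -1.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Placeholder bit values for this model only (assumption). */
#define MODEL_LOG_SAME_EXEC_OFF (1U << 0)
#define MODEL_LOG_NEW_EXEC_ON (1U << 1)
#define MODEL_LOG_SUBDOMAINS_OFF (1U << 2)
#define MODEL_TSYNC (1U << 3)

static_assert((MODEL_TSYNC & MODEL_LOG_SUBDOMAINS_OFF) == 0,
	      "model flags must not overlap");

/*
 * Models the "no ruleset" path: with ruleset_fd == -1 the call is only
 * accepted when, ignoring TSYNC, the flags are exactly LOG_SUBDOMAINS_OFF.
 * Returns 0 when accepted and -EBADF otherwise.  Calls with a real file
 * descriptor are not modeled and simply return 0.
 */
static int model_check_no_ruleset(int ruleset_fd, uint32_t flags)
{
	if (ruleset_fd == -1 &&
	    (flags & ~MODEL_TSYNC) == MODEL_LOG_SUBDOMAINS_OFF)
		return 0;

	if (ruleset_fd < 0)
		return -EBADF;

	return 0;
}
```

Masking out the TSYNC bit before the comparison is what lets
LOG_SUBDOMAINS_OFF be accepted both alone and together with TSYNC,
while any additional logging flag still falls through to the
invalid-fd error.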


^ permalink raw reply related

* [PATCH v1 1/2] landlock: Fix log_subdomains_off inheritance across fork()
From: Mickaël Salaün @ 2026-04-04  8:49 UTC (permalink / raw)
  To: Günther Noack
  Cc: Mickaël Salaün, linux-security-module, stable

hook_cred_transfer() only copies the Landlock security blob when the
source credential has a domain.  This is inconsistent with
landlock_restrict_self() which can set log_subdomains_off on a
credential without creating a domain (via the ruleset_fd=-1 path): the
field is committed but not preserved across fork() because the child's
prepare_creds() calls hook_cred_transfer() which skips the copy when
domain is NULL.

This breaks the documented use case where a process mutes subdomain logs
before forking sandboxed children: the children lose the muting and
their domains produce unexpected audit records.

Fix this by unconditionally copying the Landlock credential blob.
landlock_get_ruleset(NULL) is already a safe no-op.

Cc: Günther Noack <gnoack@google.com>
Cc: stable@vger.kernel.org
Fixes: ead9079f7569 ("landlock: Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
---
 security/landlock/cred.c                      |  6 +-
 tools/testing/selftests/landlock/audit_test.c | 88 +++++++++++++++++++
 2 files changed, 90 insertions(+), 4 deletions(-)
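
Note for readers: the behavior change can be pictured with a small
userspace model.  This is only a sketch under assumptions; the
model_* types and functions are hypothetical stand-ins and not the
kernel definitions.  It contrasts the conditional copy with the
unconditional one for a credential that has no domain but has
log_subdomains_off set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (assumption). */
struct model_ruleset {
	int usage;
};

struct model_cred_security {
	struct model_ruleset *domain;
	bool log_subdomains_off;
};

/* Takes a reference; a NULL ruleset is a no-op, as the commit message notes. */
static void model_get_ruleset(struct model_ruleset *ruleset)
{
	if (ruleset)
		ruleset->usage++;
}

/* Previous behavior: the blob is copied only when a domain exists. */
static void model_transfer_old(struct model_cred_security *new,
			       const struct model_cred_security *old)
{
	assert(new && old);
	if (old->domain) {
		model_get_ruleset(old->domain);
		*new = *old;
	}
}

/* Fixed behavior: the blob is always copied. */
static void model_transfer_new(struct model_cred_security *new,
			       const struct model_cred_security *old)
{
	assert(new && old);
	model_get_ruleset(old->domain);
	*new = *old;
}
```

With a source of { .domain = NULL, .log_subdomains_off = true } and a
zeroed destination, model_transfer_old() leaves the destination flag
false while model_transfer_new() carries it over, which is the
difference the new selftest observes across fork().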

diff --git a/security/landlock/cred.c b/security/landlock/cred.c
index 0cb3edde4d18..cc419de75cd6 100644
--- a/security/landlock/cred.c
+++ b/security/landlock/cred.c
@@ -22,10 +22,8 @@ static void hook_cred_transfer(struct cred *const new,
 	const struct landlock_cred_security *const old_llcred =
 		landlock_cred(old);
 
-	if (old_llcred->domain) {
-		landlock_get_ruleset(old_llcred->domain);
-		*landlock_cred(new) = *old_llcred;
-	}
+	landlock_get_ruleset(old_llcred->domain);
+	*landlock_cred(new) = *old_llcred;
 }
 
 static int hook_cred_prepare(struct cred *const new,
diff --git a/tools/testing/selftests/landlock/audit_test.c b/tools/testing/selftests/landlock/audit_test.c
index 46d02d49835a..20099b8667e7 100644
--- a/tools/testing/selftests/landlock/audit_test.c
+++ b/tools/testing/selftests/landlock/audit_test.c
@@ -279,6 +279,94 @@ TEST_F(audit, thread)
 				&audit_tv_default, sizeof(audit_tv_default)));
 }
 
+/*
+ * Verifies that log_subdomains_off set via the ruleset_fd=-1 path (without
+ * creating a domain) is inherited by children across fork().  This exercises
+ * the hook_cred_transfer() fix: the Landlock credential blob must be copied
+ * even when the source credential has no domain.
+ *
+ * Phase 1 (baseline): a child without muting creates a domain and triggers a
+ * denial that IS logged.
+ *
+ * Phase 2 (after muting): the parent mutes subdomain logs, forks another child
+ * who creates a domain and triggers a denial that is NOT logged.
+ */
+TEST_F(audit, log_subdomains_off_fork)
+{
+	const struct landlock_ruleset_attr ruleset_attr = {
+		.scoped = LANDLOCK_SCOPE_SIGNAL,
+	};
+	struct audit_records records;
+	int ruleset_fd, status;
+	pid_t child;
+
+	ruleset_fd =
+		landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+	ASSERT_LE(0, ruleset_fd);
+
+	ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0));
+
+	/*
+	 * Phase 1: forks a child that creates a domain and triggers a denial
+	 * before any muting.  This proves the audit path works.
+	 */
+	child = fork();
+	ASSERT_LE(0, child);
+	if (child == 0) {
+		ASSERT_EQ(0, landlock_restrict_self(ruleset_fd, 0));
+		ASSERT_EQ(-1, kill(getppid(), 0));
+		ASSERT_EQ(EPERM, errno);
+		_exit(0);
+		return;
+	}
+
+	ASSERT_EQ(child, waitpid(child, &status, 0));
+	ASSERT_EQ(true, WIFEXITED(status));
+	ASSERT_EQ(0, WEXITSTATUS(status));
+
+	/* The denial must be logged (baseline). */
+	EXPECT_EQ(0, matches_log_signal(_metadata, self->audit_fd, getpid(),
+					NULL));
+
+	/* Drains any remaining records (e.g. domain allocation). */
+	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
+
+	/*
+	 * Mutes subdomain logs without creating a domain.  The parent's
+	 * credential has domain=NULL and log_subdomains_off=1.
+	 */
+	ASSERT_EQ(0, landlock_restrict_self(
+			     -1, LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF));
+
+	/*
+	 * Phase 2: forks a child that creates a domain and triggers a denial.
+	 * Because log_subdomains_off was inherited via fork(), the child's
+	 * domain has log_status=LANDLOCK_LOG_DISABLED.
+	 */
+	child = fork();
+	ASSERT_LE(0, child);
+	if (child == 0) {
+		ASSERT_EQ(0, landlock_restrict_self(ruleset_fd, 0));
+		ASSERT_EQ(-1, kill(getppid(), 0));
+		ASSERT_EQ(EPERM, errno);
+		_exit(0);
+		return;
+	}
+
+	ASSERT_EQ(child, waitpid(child, &status, 0));
+	ASSERT_EQ(true, WIFEXITED(status));
+	ASSERT_EQ(0, WEXITSTATUS(status));
+
+	/* No denial record should appear. */
+	EXPECT_EQ(-EAGAIN, matches_log_signal(_metadata, self->audit_fd,
+					      getpid(), NULL));
+
+	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
+	EXPECT_EQ(0, records.access);
+
+	EXPECT_EQ(0, close(ruleset_fd));
+}
+
 FIXTURE(audit_flags)
 {
 	struct audit_filter audit_filter;
-- 
2.53.0


^ permalink raw reply related

* Re: [PATCH v4 2/3] lsm: add backing_file LSM hooks
From: Paul Moore @ 2026-04-03 21:14 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs, Gao Xiang, Christian Brauner
In-Reply-To: <CAOQ4uxgd1wo9U32L_sQLfswY93LRp4yPzkJvKtj=wDKi8h13gg@mail.gmail.com>

On Fri, Apr 3, 2026 at 2:12 AM Amir Goldstein <amir73il@gmail.com> wrote:
> On Fri, Apr 3, 2026 at 5:09 AM Paul Moore <paul@paul-moore.com> wrote:
> >
> > Stacked filesystems such as overlayfs do not currently provide the
> > necessary mechanisms for LSMs to properly enforce access controls on the
> > mmap() and mprotect() operations.  In order to resolve this gap, a LSM
> > security blob is being added to the backing_file struct and the following
> > new LSM hooks are being created:
> >
> >  security_backing_file_alloc()
> >  security_backing_file_free()
> >  security_mmap_backing_file()
> >
> > The first two hooks are to manage the lifecycle of the LSM security blob
> > in the backing_file struct, while the third provides a new mmap() access
> > control point for the underlying backing file.  It is also expected that
> > LSMs will likely want to update their security_file_mprotect() callback
> > to address issues with their mprotect() controls, but that does not
> > require a change to the security_file_mprotect() LSM hook.
> >
> > There are three other small changes to support these new LSM hooks:
> > * Pass the user file associated with a backing file down to
> > alloc_empty_backing_file() so it can be included in the
> > security_backing_file_alloc() hook.
> > * Add getter and setter functions for the backing_file struct LSM blob
> > as the backing_file struct remains private to fs/file_table.c.
> > * Constify the file struct field in the LSM common_audit_data struct to
> > better support LSMs that need to pass a const file struct pointer into
> > the common LSM audit code.
> >
> > Thanks to Arnd Bergmann for identifying the missing EXPORT_SYMBOL_GPL()
> > and supplying a fixup.
> >
> > Cc: stable@vger.kernel.org
> > Cc: linux-fsdevel@vger.kernel.org
> > Cc: linux-unionfs@vger.kernel.org
> > Cc: linux-erofs@lists.ozlabs.org
> > Signed-off-by: Paul Moore <paul@paul-moore.com>
>
> That looks nicer.
>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>
> Thanks,
> Amir.

Thanks for refreshing your review.  Since we are at the end of -rc6,
it probably doesn't make much sense to put this in lsm/stable-7.0; I'm
going to merge this into lsm/dev which should give us at least one
week in linux-next before the v7.1 merge window opens.  If others want
to add their ACKs/Reviewed-by during that time, I'll update the
branch.

-- 
paul-moore.com

^ permalink raw reply

* Re: [PATCH v3 0/5] Fix Landlock audit test flakiness
From: Mickaël Salaün @ 2026-04-03 17:08 UTC (permalink / raw)
  To: Günther Noack
  Cc: Günther Noack, linux-security-module, Justin Suess,
	Tingmao Wang
In-Reply-To: <20260402.eb5c4e85f472@gnoack.org>

On Thu, Apr 02, 2026 at 10:52:46PM +0200, Günther Noack wrote:
> Hello!
> 
> On Thu, Apr 02, 2026 at 09:26:01PM +0200, Mickaël Salaün wrote:
> > This series fixes two classes of audit selftest failures plus two minor
> > bugs in the audit test helpers.
> > 
> > The main issue is that domain deallocation audit records are emitted
> > asynchronously from kworker threads and can arrive after a previous
> > test's socket has been closed.  This causes two distinct failure modes:
> > 
> > - audit_match_record() picks up a stale deallocation record from a
> >   previous test instead of the expected one, causing a domain ID
> >   mismatch.  The audit.layers test (which reads 16 deallocation records
> >   in sequence) is particularly vulnerable because the large read window
> >   allows stale records to interleave.  Patch 4 fixes this by filtering
> >   deallocation records by domain ID and skipping type-matching records
> >   with wrong content patterns.
> > 
> > - audit_count_records() counts stale deallocation records from a
> >   previous test, incrementing records.domain from the expected 0 to 1.
> >   Patch 3 fixes this by draining stale records at audit_init() time and
> >   removing records.domain == 0 checks that are not preceded by
> >   audit_match_record() calls (which would consume stale records).
> > 
> > These races are more likely to manifest when additional instrumentation
> > changes kworker timing in the deallocation path (e.g. with the upcoming
> > Landlock tracepoints work).
> > 
> > The two minor fixes (patches 1-2) correct a snprintf truncation check
> > off-by-one and socket file descriptor leaks on error paths in
> > audit_init(), audit_init_with_exe_filter(), and audit_cleanup().
> > Patch 5 fixes a __u64 format warning reported by the kbuild bot on
> > powerpc64.
> > 
> > Patch 1 is an exact subset of the v1 combined patch, which is why it
> > carries the Reviewed-by tag.  Patches 2 and 3 extend beyond what was in
> > v1, so the Reviewed-by is not carried.  Patches 4 and 5 are new.
> > 
> > Changes since v2:
> > https://lore.kernel.org/r/20260401161503.1136946-1-mic@digikod.net
> > - Patches 4-5: fix __u64 format warnings on powerpc64 (cast to unsigned
> >   long long for %llx).  Patch 5 is new.
> > 
> > Changes since v1:
> > https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
> > - Split the combined drain fix into four separate patches.
> > - Patch 2: extend fd leak fix to audit_init_with_exe_filter() and
> >   audit_cleanup().
> > - Patch 3: also remove domain checks from audit.trace and
> >   scoped_audit.connect_to_child, document constraint, explain why a
> >   longer drain timeout was rejected.
> > - Patch 4: new, add domain ID filtering and timeout management to
> >   matches_log_domain_deallocated(), skip stale records in
> >   audit_match_record().
> > 
> > Mickaël Salaün (5):
> >   selftests/landlock: Fix snprintf truncation checks in audit helpers
> >   selftests/landlock: Fix socket file descriptor leaks in audit helpers
> >   selftests/landlock: Drain stale audit records on init
> >   selftests/landlock: Skip stale records in audit_match_record()
> >   selftests/landlock: Fix format warning for __u64 in net_test
> > 
> >  tools/testing/selftests/landlock/audit.h      | 133 ++++++++++++++----
> >  tools/testing/selftests/landlock/audit_test.c |  36 ++---
> >  tools/testing/selftests/landlock/net_test.c   |   2 +-
> >  .../testing/selftests/landlock/ptrace_test.c  |   1 -
> >  .../landlock/scoped_abstract_unix_test.c      |   1 -
> >  5 files changed, 119 insertions(+), 54 deletions(-)
> > 
> > -- 
> > 2.53.0
> > 
> 
> I am afraid I am still getting flaky audit tests even with these
> patches.  Which tests flake differs from run to run, but some of them
> still do, for example:
> 
> #  RUN           audit_layout1.remove_dir ...
> # fs_test.c:7281:remove_dir:Expected 0 (0) == matches_log_fs(_metadata, self->audit_fd, "fs\\.remove_dir", dir_s1d2) (-11)
> # remove_dir: Test failed
> #          ❌ FAIL  audit_layout1.remove_dir
> not ok 191 audit_layout1.remove_dir
> #  RUN           audit_layout1.read_dir ...
> #            ✅ OK  audit_layout1.read_dir
> ok 192 audit_layout1.read_dir
> #  RUN           audit_layout1.read_file ...
> #            ✅ OK  audit_layout1.read_file
> ok 193 audit_layout1.read_file
> #  RUN           audit_layout1.write_file ...
> # fs_test.c:7221:write_file:Expected 0 (0) == matches_log_fs(_metadata, self->audit_fd, "fs\\.write_file", file1_s1d1) (-11)
> # fs_test.c:7224:write_file:Expected 0 (0) == records.access (1)
> # write_file: Test failed
> #          ❌ FAIL  audit_layout1.write_file
> not ok 194 audit_layout1.write_file

I never hit these issues and I cannot reproduce them.  This patch fixes
the async events (i.e. domain drops).

You can try to increase audit_tv_default.
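
A minimal sketch of what a larger timeout means, assuming
audit_tv_default is the struct timeval applied to the audit socket as
its receive timeout (SO_RCVTIMEO).  The model_* helpers are
hypothetical and not part of the selftests; the (-11) values in the
quoted failures are consistent with -EAGAIN, which is what an expired
receive timeout reports.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: sets the receive timeout of an open socket. */
static int model_set_recv_timeout(int fd, long usec)
{
	struct timeval tv;

	assert(usec >= 0);
	tv.tv_sec = usec / 1000000;
	tv.tv_usec = usec % 1000000;
	return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Returns true if a one-byte recv() gave up because the timeout expired. */
static bool model_recv_timed_out(int fd)
{
	char c;
	const ssize_t n = recv(fd, &c, 1, 0);

	return n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```

A longer timeout gives slow records more time to arrive, at the cost
of making every "no record expected" check wait that much longer.
Note that a zero timeval disables the timeout entirely, so recv()
would then block.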

> 
> My kernel config is this:
> 
>     make defconfig
>     make kvm_guest.config
>     KCONFIG_CONFIG="${KBUILD_OUTPUT}/.config" ./scripts/kconfig/merge_config.sh "${KBUILD_OUTPUT}/.config" tools/testing/selftests/landlock/config
>     make debug.config
>     echo "CONFIG_RANDOMIZE_BASE=n" >> "${KBUILD_OUTPUT}/.config"
>     make olddefconfig
> 
> and then I run the selftests in Qemu with these flags:
> 
> qemu-system-x86_64 \
>     -nographic \
>     -m 4G \
>     -enable-kvm \
>     -append "console=ttyS0 lsm=landlock no_hash_pointers" \
>     -kernel "${KBUILD_OUTPUT}/arch/x86/boot/bzImage" \
>     -initrd "${INITRAMFS}"
> 
> This is using my own selftest runner scripts which builds an initramfs
> with the statically linked selftests.

Can you try with the check-linux.sh build kselftest (which also sets a
lot of debug options)?  You can also try with qemu if you set
ARCH=x86_64.

> 
> Do you have a hunch what might be missing there?  In the test run
> above, I have applied your V4 patch set on top of the current master,
> 5619b098e2fbf3a23bf13d91897056a1fe238c6d ("Merge tag 'for-7.0-rc6-tag'
> of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux").

This is weird because this is related to FS events, and they should be
(almost) synchronous events.  Maybe the audit event pipeline is made
very slow because of some audit options but still...

Anyway, this is not what this patch fixes, but we should fix your issues
as well.

^ permalink raw reply

* Re: [PATCH v4 3/3] selinux: fix overlayfs mmap() and mprotect() access checks
From: Amir Goldstein @ 2026-04-03  6:17 UTC (permalink / raw)
  To: Paul Moore
  Cc: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs, Gao Xiang, Christian Brauner
In-Reply-To: <20260403030848.731867-8-paul@paul-moore.com>

On Fri, Apr 3, 2026 at 5:09 AM Paul Moore <paul@paul-moore.com> wrote:
>
> The existing SELinux security model for overlayfs is to allow access if
> the current task is able to access the top level file (the "user" file)
> and the mounter's credentials are sufficient to access the lower
> level file (the "backing" file).  Unfortunately, the current code does
> not properly enforce these access controls for both mmap() and mprotect()
> operations on overlayfs filesystems.
>
> This patch makes use of the newly created security_mmap_backing_file()
> LSM hook to provide the missing backing file enforcement for mmap()
> operations, and leverages the backing file API and new LSM blob to
> provide the necessary information to properly enforce the mprotect()
> access controls.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Paul Moore <paul@paul-moore.com>

I can't say much about the SELinux implementation, but for the use of
the backing file API and the overall approach:

Acked-by: Amir Goldstein <amir73il@gmail.com>

Thanks,
Amir.

> ---
>  security/selinux/hooks.c          | 256 +++++++++++++++++++++---------
>  security/selinux/include/objsec.h |  11 ++
>  2 files changed, 196 insertions(+), 71 deletions(-)
>
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index d8224ea113d1..76e0fb7dcb36 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -1745,6 +1745,60 @@ static inline int file_path_has_perm(const struct cred *cred,
>  static int bpf_fd_pass(const struct file *file, u32 sid);
>  #endif
>
> +static int __file_has_perm(const struct cred *cred, const struct file *file,
> +                          u32 av, bool bf_user_file)
> +
> +{
> +       struct common_audit_data ad;
> +       struct inode *inode;
> +       u32 ssid = cred_sid(cred);
> +       u32 tsid_fd;
> +       int rc;
> +
> +       if (bf_user_file) {
> +               struct backing_file_security_struct *bfsec;
> +               const struct path *path;
> +
> +               if (WARN_ON(!(file->f_mode & FMODE_BACKING)))
> +                       return -EIO;
> +
> +               bfsec = selinux_backing_file(file);
> +               path = backing_file_user_path(file);
> +               tsid_fd = bfsec->uf_sid;
> +               inode = d_inode(path->dentry);
> +
> +               ad.type = LSM_AUDIT_DATA_PATH;
> +               ad.u.path = *path;
> +       } else {
> +               struct file_security_struct *fsec = selinux_file(file);
> +
> +               tsid_fd = fsec->sid;
> +               inode = file_inode(file);
> +
> +               ad.type = LSM_AUDIT_DATA_FILE;
> +               ad.u.file = file;
> +       }
> +
> +       if (ssid != tsid_fd) {
> +               rc = avc_has_perm(ssid, tsid_fd, SECCLASS_FD, FD__USE, &ad);
> +               if (rc)
> +                       return rc;
> +       }
> +
> +#ifdef CONFIG_BPF_SYSCALL
> +       /* regardless of backing vs user file, use the underlying file here */
> +       rc = bpf_fd_pass(file, ssid);
> +       if (rc)
> +               return rc;
> +#endif
> +
> +       /* av is zero if only checking access to the descriptor. */
> +       if (av)
> +               return inode_has_perm(cred, inode, av, &ad);
> +
> +       return 0;
> +}
> +
>  /* Check whether a task can use an open file descriptor to
>     access an inode in a given way.  Check access to the
>     descriptor itself, and then use dentry_has_perm to
> @@ -1753,41 +1807,10 @@ static int bpf_fd_pass(const struct file *file, u32 sid);
>     has the same SID as the process.  If av is zero, then
>     access to the file is not checked, e.g. for cases
>     where only the descriptor is affected like seek. */
> -static int file_has_perm(const struct cred *cred,
> -                        struct file *file,
> -                        u32 av)
> +static inline int file_has_perm(const struct cred *cred,
> +                               const struct file *file, u32 av)
>  {
> -       struct file_security_struct *fsec = selinux_file(file);
> -       struct inode *inode = file_inode(file);
> -       struct common_audit_data ad;
> -       u32 sid = cred_sid(cred);
> -       int rc;
> -
> -       ad.type = LSM_AUDIT_DATA_FILE;
> -       ad.u.file = file;
> -
> -       if (sid != fsec->sid) {
> -               rc = avc_has_perm(sid, fsec->sid,
> -                                 SECCLASS_FD,
> -                                 FD__USE,
> -                                 &ad);
> -               if (rc)
> -                       goto out;
> -       }
> -
> -#ifdef CONFIG_BPF_SYSCALL
> -       rc = bpf_fd_pass(file, cred_sid(cred));
> -       if (rc)
> -               return rc;
> -#endif
> -
> -       /* av is zero if only checking access to the descriptor. */
> -       rc = 0;
> -       if (av)
> -               rc = inode_has_perm(cred, inode, av, &ad);
> -
> -out:
> -       return rc;
> +       return __file_has_perm(cred, file, av, false);
>  }
>
>  /*
> @@ -3825,6 +3848,17 @@ static int selinux_file_alloc_security(struct file *file)
>         return 0;
>  }
>
> +static int selinux_backing_file_alloc(struct file *backing_file,
> +                                     const struct file *user_file)
> +{
> +       struct backing_file_security_struct *bfsec;
> +
> +       bfsec = selinux_backing_file(backing_file);
> +       bfsec->uf_sid = selinux_file(user_file)->sid;
> +
> +       return 0;
> +}
> +
>  /*
>   * Check whether a task has the ioctl permission and cmd
>   * operation to an inode.
> @@ -3942,42 +3976,55 @@ static int selinux_file_ioctl_compat(struct file *file, unsigned int cmd,
>
>  static int default_noexec __ro_after_init;
>
> -static int file_map_prot_check(struct file *file, unsigned long prot, int shared)
> +static int __file_map_prot_check(const struct cred *cred,
> +                                const struct file *file, unsigned long prot,
> +                                bool shared, bool bf_user_file)
>  {
> -       const struct cred *cred = current_cred();
> -       u32 sid = cred_sid(cred);
> -       int rc = 0;
> +       struct inode *inode = NULL;
> +       bool prot_exec = prot & PROT_EXEC;
> +       bool prot_write = prot & PROT_WRITE;
> +
> +       if (file) {
> +               if (bf_user_file)
> +                       inode = d_inode(backing_file_user_path(file)->dentry);
> +               else
> +                       inode = file_inode(file);
> +       }
> +
> +       if (default_noexec && prot_exec &&
> +           (!file || IS_PRIVATE(inode) || (!shared && prot_write))) {
> +               int rc;
> +               u32 sid = cred_sid(cred);
>
> -       if (default_noexec &&
> -           (prot & PROT_EXEC) && (!file || IS_PRIVATE(file_inode(file)) ||
> -                                  (!shared && (prot & PROT_WRITE)))) {
>                 /*
> -                * We are making executable an anonymous mapping or a
> -                * private file mapping that will also be writable.
> -                * This has an additional check.
> +                * We are making executable an anonymous mapping or a private
> +                * file mapping that will also be writable.
>                  */
> -               rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
> -                                 PROCESS__EXECMEM, NULL);
> +               rc = avc_has_perm(sid, sid, SECCLASS_PROCESS, PROCESS__EXECMEM,
> +                                 NULL);
>                 if (rc)
> -                       goto error;
> +                       return rc;
>         }
>
>         if (file) {
> -               /* read access is always possible with a mapping */
> +               /* "read" always possible, "write" only if shared */
>                 u32 av = FILE__READ;
> -
> -               /* write access only matters if the mapping is shared */
> -               if (shared && (prot & PROT_WRITE))
> +               if (shared && prot_write)
>                         av |= FILE__WRITE;
> -
> -               if (prot & PROT_EXEC)
> +               if (prot_exec)
>                         av |= FILE__EXECUTE;
>
> -               return file_has_perm(cred, file, av);
> +               return __file_has_perm(cred, file, av, bf_user_file);
>         }
>
> -error:
> -       return rc;
> +       return 0;
> +}
> +
> +static inline int file_map_prot_check(const struct cred *cred,
> +                                     const struct file *file,
> +                                     unsigned long prot, bool shared)
> +{
> +       return __file_map_prot_check(cred, file, prot, shared, false);
>  }
>
>  static int selinux_mmap_addr(unsigned long addr)
> @@ -3993,36 +4040,80 @@ static int selinux_mmap_addr(unsigned long addr)
>         return rc;
>  }
>
> -static int selinux_mmap_file(struct file *file,
> -                            unsigned long reqprot __always_unused,
> -                            unsigned long prot, unsigned long flags)
> +static int selinux_mmap_file_common(const struct cred *cred, struct file *file,
> +                                   unsigned long prot, bool shared)
>  {
> -       struct common_audit_data ad;
> -       int rc;
> -
>         if (file) {
> +               int rc;
> +               struct common_audit_data ad;
> +
>                 ad.type = LSM_AUDIT_DATA_FILE;
>                 ad.u.file = file;
> -               rc = inode_has_perm(current_cred(), file_inode(file),
> -                                   FILE__MAP, &ad);
> +               rc = inode_has_perm(cred, file_inode(file), FILE__MAP, &ad);
>                 if (rc)
>                         return rc;
>         }
>
> -       return file_map_prot_check(file, prot,
> -                                  (flags & MAP_TYPE) == MAP_SHARED);
> +       return file_map_prot_check(cred, file, prot, shared);
> +}
> +
> +static int selinux_mmap_file(struct file *file,
> +                            unsigned long reqprot __always_unused,
> +                            unsigned long prot, unsigned long flags)
> +{
> +       return selinux_mmap_file_common(current_cred(), file, prot,
> +                                       (flags & MAP_TYPE) == MAP_SHARED);
> +}
> +
> +/**
> + * selinux_mmap_backing_file - Check mmap permissions on a backing file
> + * @vma: memory region
> + * @backing_file: stacked filesystem backing file
> + * @user_file: user visible file
> + *
> + * This is called after selinux_mmap_file() on stacked filesystems, and it
> + * is this function's responsibility to verify access to @backing_file and
> + * setup the SELinux state for possible later use in the mprotect() code path.
> + *
> + * By the time this function is called, mmap() access to @user_file has already
> + * been authorized and @vma->vm_file has been set to point to @backing_file.
> + *
> + * Return zero on success, negative values otherwise.
> + */
> +static int selinux_mmap_backing_file(struct vm_area_struct *vma,
> +                                    struct file *backing_file,
> +                                    struct file *user_file __always_unused)
> +{
> +       unsigned long prot = 0;
> +
> +       /* translate vma->vm_flags perms into PROT perms */
> +       if (vma->vm_flags & VM_READ)
> +               prot |= PROT_READ;
> +       if (vma->vm_flags & VM_WRITE)
> +               prot |= PROT_WRITE;
> +       if (vma->vm_flags & VM_EXEC)
> +               prot |= PROT_EXEC;
> +
> +       return selinux_mmap_file_common(backing_file->f_cred, backing_file,
> +                                       prot, vma->vm_flags & VM_SHARED);
>  }
>
>  static int selinux_file_mprotect(struct vm_area_struct *vma,
>                                  unsigned long reqprot __always_unused,
>                                  unsigned long prot)
>  {
> +       int rc;
>         const struct cred *cred = current_cred();
>         u32 sid = cred_sid(cred);
> +       const struct file *file = vma->vm_file;
> +       bool backing_file;
> +       bool shared = vma->vm_flags & VM_SHARED;
> +
> +       /* check if we need to trigger the "backing files are awful" mode */
> +       backing_file = file && (file->f_mode & FMODE_BACKING);
>
>         if (default_noexec &&
>             (prot & PROT_EXEC) && !(vma->vm_flags & VM_EXEC)) {
> -               int rc = 0;
>                 /*
>                  * We don't use the vma_is_initial_heap() helper as it has
>                  * a history of problems and is currently broken on systems
> @@ -4036,11 +4127,15 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
>                     vma->vm_end <= vma->vm_mm->brk) {
>                         rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
>                                           PROCESS__EXECHEAP, NULL);
> -               } else if (!vma->vm_file && (vma_is_initial_stack(vma) ||
> +                       if (rc)
> +                               return rc;
> +               } else if (!file && (vma_is_initial_stack(vma) ||
>                             vma_is_stack_for_current(vma))) {
>                         rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
>                                           PROCESS__EXECSTACK, NULL);
> -               } else if (vma->vm_file && vma->anon_vma) {
> +                       if (rc)
> +                               return rc;
> +               } else if (file && vma->anon_vma) {
>                         /*
>                          * We are making executable a file mapping that has
>                          * had some COW done. Since pages might have been
> @@ -4048,13 +4143,29 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
>                          * modified content.  This typically should only
>                          * occur for text relocations.
>                          */
> -                       rc = file_has_perm(cred, vma->vm_file, FILE__EXECMOD);
> +                       rc = __file_has_perm(cred, file, FILE__EXECMOD,
> +                                            backing_file);
> +                       if (rc)
> +                               return rc;
> +                       if (backing_file) {
> +                               rc = file_has_perm(file->f_cred, file,
> +                                                  FILE__EXECMOD);
> +                               if (rc)
> +                                       return rc;
> +                       }
>                 }
> +       }
> +
> +       rc = __file_map_prot_check(cred, file, prot, shared, backing_file);
> +       if (rc)
> +               return rc;
> +       if (backing_file) {
> +               rc = file_map_prot_check(file->f_cred, file, prot, shared);
>                 if (rc)
>                         return rc;
>         }
>
> -       return file_map_prot_check(vma->vm_file, prot, vma->vm_flags&VM_SHARED);
> +       return 0;
>  }
>
>  static int selinux_file_lock(struct file *file, unsigned int cmd)
> @@ -7393,6 +7504,7 @@ struct lsm_blob_sizes selinux_blob_sizes __ro_after_init = {
>         .lbs_cred = sizeof(struct cred_security_struct),
>         .lbs_task = sizeof(struct task_security_struct),
>         .lbs_file = sizeof(struct file_security_struct),
> +       .lbs_backing_file = sizeof(struct backing_file_security_struct),
>         .lbs_inode = sizeof(struct inode_security_struct),
>         .lbs_ipc = sizeof(struct ipc_security_struct),
>         .lbs_key = sizeof(struct key_security_struct),
> @@ -7498,9 +7610,11 @@ static struct security_hook_list selinux_hooks[] __ro_after_init = {
>
>         LSM_HOOK_INIT(file_permission, selinux_file_permission),
>         LSM_HOOK_INIT(file_alloc_security, selinux_file_alloc_security),
> +       LSM_HOOK_INIT(backing_file_alloc, selinux_backing_file_alloc),
>         LSM_HOOK_INIT(file_ioctl, selinux_file_ioctl),
>         LSM_HOOK_INIT(file_ioctl_compat, selinux_file_ioctl_compat),
>         LSM_HOOK_INIT(mmap_file, selinux_mmap_file),
> +       LSM_HOOK_INIT(mmap_backing_file, selinux_mmap_backing_file),
>         LSM_HOOK_INIT(mmap_addr, selinux_mmap_addr),
>         LSM_HOOK_INIT(file_mprotect, selinux_file_mprotect),
>         LSM_HOOK_INIT(file_lock, selinux_file_lock),
> diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
> index 5bddd28ea5cb..b19e5d978e82 100644
> --- a/security/selinux/include/objsec.h
> +++ b/security/selinux/include/objsec.h
> @@ -88,6 +88,10 @@ struct file_security_struct {
>         u32 pseqno; /* Policy seqno at the time of file open */
>  };
>
> +struct backing_file_security_struct {
> +       u32 uf_sid; /* associated user file fsec->sid */
> +};
> +
>  struct superblock_security_struct {
>         u32 sid; /* SID of file system superblock */
>         u32 def_sid; /* default SID for labeling */
> @@ -195,6 +199,13 @@ static inline struct file_security_struct *selinux_file(const struct file *file)
>         return file->f_security + selinux_blob_sizes.lbs_file;
>  }
>
> +static inline struct backing_file_security_struct *
> +selinux_backing_file(const struct file *backing_file)
> +{
> +       void *blob = backing_file_security(backing_file);
> +       return blob + selinux_blob_sizes.lbs_backing_file;
> +}
> +
>  static inline struct inode_security_struct *
>  selinux_inode(const struct inode *inode)
>  {
> --
> 2.53.0
>
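
The prot translation in selinux_mmap_backing_file() and the access vector
built in __file_map_prot_check() can be tried out in userspace. This is only
a rough sketch: every SK_* name and value is a hypothetical stand-in, not a
kernel or SELinux definition, and only the PROT_* constants are real (from
<sys/mman.h>).

```c
#include <stdbool.h>
#include <sys/mman.h>	/* PROT_READ, PROT_WRITE, PROT_EXEC */

/* Hypothetical stand-ins for the kernel's VM_* flags (illustration only). */
#define SK_VM_READ	0x1UL
#define SK_VM_WRITE	0x2UL
#define SK_VM_EXEC	0x4UL
#define SK_VM_SHARED	0x8UL

/* Hypothetical stand-ins for the SELinux FILE__* permission bits. */
#define SK_FILE_READ	0x1u
#define SK_FILE_WRITE	0x2u
#define SK_FILE_EXECUTE	0x4u

/* Translate vma-style permission flags into PROT_* bits. */
static inline unsigned long sk_vm_flags_to_prot(unsigned long vm_flags)
{
	unsigned long prot = 0;

	if (vm_flags & SK_VM_READ)
		prot |= PROT_READ;
	if (vm_flags & SK_VM_WRITE)
		prot |= PROT_WRITE;
	if (vm_flags & SK_VM_EXEC)
		prot |= PROT_EXEC;
	return prot;
}

/* "read" is always requested, "write" only when the mapping is shared. */
static inline unsigned int sk_file_map_av(unsigned long prot, bool shared)
{
	unsigned int av = SK_FILE_READ;

	if (shared && (prot & PROT_WRITE))
		av |= SK_FILE_WRITE;
	if (prot & PROT_EXEC)
		av |= SK_FILE_EXECUTE;
	return av;
}
```

A private writable mapping never asks for the write permission in this
sketch, which mirrors the "write only if shared" comment in the quoted hunk.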


* Re: [PATCH v4 2/3] lsm: add backing_file LSM hooks
From: Amir Goldstein @ 2026-04-03  6:12 UTC
  To: Paul Moore
  Cc: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs, Gao Xiang, Christian Brauner
In-Reply-To: <20260403030848.731867-7-paul@paul-moore.com>

On Fri, Apr 3, 2026 at 5:09 AM Paul Moore <paul@paul-moore.com> wrote:
>
> Stacked filesystems such as overlayfs do not currently provide the
> necessary mechanisms for LSMs to properly enforce access controls on the
> mmap() and mprotect() operations.  In order to resolve this gap, a LSM
> security blob is being added to the backing_file struct and the following
> new LSM hooks are being created:
>
>  security_backing_file_alloc()
>  security_backing_file_free()
>  security_mmap_backing_file()
>
> The first two hooks are to manage the lifecycle of the LSM security blob
> in the backing_file struct, while the third provides a new mmap() access
> control point for the underlying backing file.  It is also expected that
> LSMs will likely want to update their security_file_mprotect() callback
> to address issues with their mprotect() controls, but that does not
> require a change to the security_file_mprotect() LSM hook.
>
> There are three other small changes to support these new LSM hooks:
> * Pass the user file associated with a backing file down to
> alloc_empty_backing_file() so it can be included in the
> security_backing_file_alloc() hook.
> * Add getter and setter functions for the backing_file struct LSM blob
> as the backing_file struct remains private to fs/file_table.c.
> * Constify the file struct field in the LSM common_audit_data struct to
> better support LSMs that need to pass a const file struct pointer into
> the common LSM audit code.
>
> Thanks to Arnd Bergmann for identifying the missing EXPORT_SYMBOL_GPL()
> and supplying a fixup.
>
> Cc: stable@vger.kernel.org
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-unionfs@vger.kernel.org
> Cc: linux-erofs@lists.ozlabs.org
> Signed-off-by: Paul Moore <paul@paul-moore.com>

That looks nicer.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

Thanks,
Amir.
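
The blob lifecycle from the commit message (allocate the blob, let the LSM
initialize it, free it again if the hook fails) can be sketched in userspace
roughly as follows. All sk_* names are hypothetical, calloc()/free() stand in
for the kmem cache, and the hook is reduced to one function pointer, so treat
this as an illustration of the pattern and not as the patch's implementation.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the private backing_file struct. */
struct sk_backing_file {
	void *security;		/* LSM blob, NULL when none is allocated */
};

/* Hypothetical LSM init hook; returns 0 or a negative errno. */
typedef int (*sk_alloc_hook_t)(struct sk_backing_file *bf,
			       const void *user_file);

/* Free the blob and clear the pointer so a repeated call is harmless. */
static inline void sk_backing_file_free(struct sk_backing_file *bf)
{
	free(bf->security);
	bf->security = NULL;
}

/* Allocate a zeroed blob, run the hook, and undo the allocation on failure. */
static inline int sk_backing_file_alloc(struct sk_backing_file *bf,
					const void *user_file,
					size_t blob_size,
					sk_alloc_hook_t hook)
{
	int rc;

	bf->security = NULL;
	if (blob_size) {
		bf->security = calloc(1, blob_size);
		if (!bf->security)
			return -ENOMEM;
	}

	rc = hook ? hook(bf, user_file) : 0;
	if (rc)
		sk_backing_file_free(bf);
	return rc;
}

/* Example hooks that exercise the success and failure paths. */
static inline int sk_hook_ok(struct sk_backing_file *bf, const void *user_file)
{
	(void)bf;
	(void)user_file;
	return 0;
}

static inline int sk_hook_fail(struct sk_backing_file *bf,
			       const void *user_file)
{
	(void)bf;
	(void)user_file;
	return -EACCES;
}
```

The sketch only inspects @user_file during allocation and never stores a
pointer to it, in line with the hook's advice against holding a reference.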

> ---
>  fs/backing-file.c             |  18 ++++--
>  fs/erofs/ishare.c             |  10 +++-
>  fs/file_table.c               |  27 +++++++--
>  fs/fuse/passthrough.c         |   2 +-
>  fs/internal.h                 |   3 +-
>  fs/overlayfs/dir.c            |   2 +-
>  fs/overlayfs/file.c           |   2 +-
>  include/linux/backing-file.h  |   4 +-
>  include/linux/fs.h            |  13 +++++
>  include/linux/lsm_audit.h     |   2 +-
>  include/linux/lsm_hook_defs.h |   5 ++
>  include/linux/lsm_hooks.h     |   1 +
>  include/linux/security.h      |  22 ++++++++
>  security/lsm.h                |   1 +
>  security/lsm_init.c           |   9 +++
>  security/security.c           | 102 ++++++++++++++++++++++++++++++++++
>  16 files changed, 206 insertions(+), 17 deletions(-)
>
> diff --git a/fs/backing-file.c b/fs/backing-file.c
> index 45da8600d564..1f3bbfc75882 100644
> --- a/fs/backing-file.c
> +++ b/fs/backing-file.c
> @@ -12,6 +12,7 @@
>  #include <linux/backing-file.h>
>  #include <linux/splice.h>
>  #include <linux/mm.h>
> +#include <linux/security.h>
>
>  #include "internal.h"
>
> @@ -29,14 +30,15 @@
>   * returned file into a container structure that also stores the stacked
>   * file's path, which can be retrieved using backing_file_user_path().
>   */
> -struct file *backing_file_open(const struct path *user_path, int flags,
> +struct file *backing_file_open(const struct file *user_file, int flags,
>                                const struct path *real_path,
>                                const struct cred *cred)
>  {
> +       const struct path *user_path = &user_file->f_path;
>         struct file *f;
>         int error;
>
> -       f = alloc_empty_backing_file(flags, cred);
> +       f = alloc_empty_backing_file(flags, cred, user_file);
>         if (IS_ERR(f))
>                 return f;
>
> @@ -52,15 +54,16 @@ struct file *backing_file_open(const struct path *user_path, int flags,
>  }
>  EXPORT_SYMBOL_GPL(backing_file_open);
>
> -struct file *backing_tmpfile_open(const struct path *user_path, int flags,
> +struct file *backing_tmpfile_open(const struct file *user_file, int flags,
>                                   const struct path *real_parentpath,
>                                   umode_t mode, const struct cred *cred)
>  {
>         struct mnt_idmap *real_idmap = mnt_idmap(real_parentpath->mnt);
> +       const struct path *user_path = &user_file->f_path;
>         struct file *f;
>         int error;
>
> -       f = alloc_empty_backing_file(flags, cred);
> +       f = alloc_empty_backing_file(flags, cred, user_file);
>         if (IS_ERR(f))
>                 return f;
>
> @@ -336,8 +339,13 @@ int backing_file_mmap(struct file *file, struct vm_area_struct *vma,
>
>         vma_set_file(vma, file);
>
> -       scoped_with_creds(ctx->cred)
> +       scoped_with_creds(ctx->cred) {
> +               ret = security_mmap_backing_file(vma, file, user_file);
> +               if (ret)
> +                       return ret;
> +
>                 ret = vfs_mmap(vma->vm_file, vma);
> +       }
>
>         if (ctx->accessed)
>                 ctx->accessed(user_file);
> diff --git a/fs/erofs/ishare.c b/fs/erofs/ishare.c
> index ec433bacc592..6ed66b17359b 100644
> --- a/fs/erofs/ishare.c
> +++ b/fs/erofs/ishare.c
> @@ -4,6 +4,7 @@
>   */
>  #include <linux/xxhash.h>
>  #include <linux/mount.h>
> +#include <linux/security.h>
>  #include "internal.h"
>  #include "xattr.h"
>
> @@ -106,7 +107,8 @@ static int erofs_ishare_file_open(struct inode *inode, struct file *file)
>
>         if (file->f_flags & O_DIRECT)
>                 return -EINVAL;
> -       realfile = alloc_empty_backing_file(O_RDONLY|O_NOATIME, current_cred());
> +       realfile = alloc_empty_backing_file(O_RDONLY|O_NOATIME, current_cred(),
> +                                           file);
>         if (IS_ERR(realfile))
>                 return PTR_ERR(realfile);
>         ihold(sharedinode);
> @@ -150,8 +152,14 @@ static ssize_t erofs_ishare_file_read_iter(struct kiocb *iocb,
>  static int erofs_ishare_mmap(struct file *file, struct vm_area_struct *vma)
>  {
>         struct file *realfile = file->private_data;
> +       int err;
>
>         vma_set_file(vma, realfile);
> +
> +       err = security_mmap_backing_file(vma, realfile, file);
> +       if (err)
> +               return err;
> +
>         return generic_file_readonly_mmap(file, vma);
>  }
>
> diff --git a/fs/file_table.c b/fs/file_table.c
> index 3b3792903185..d19d879b6efc 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -50,6 +50,9 @@ struct backing_file {
>                 struct path user_path;
>                 freeptr_t bf_freeptr;
>         };
> +#ifdef CONFIG_SECURITY
> +       void *security;
> +#endif
>  };
>
>  #define backing_file(f) container_of(f, struct backing_file, file)
> @@ -66,8 +69,21 @@ void backing_file_set_user_path(struct file *f, const struct path *path)
>  }
>  EXPORT_SYMBOL_GPL(backing_file_set_user_path);
>
> +#ifdef CONFIG_SECURITY
> +void *backing_file_security(const struct file *f)
> +{
> +       return backing_file(f)->security;
> +}
> +
> +void backing_file_set_security(struct file *f, void *security)
> +{
> +       backing_file(f)->security = security;
> +}
> +#endif /* CONFIG_SECURITY */
> +
>  static inline void backing_file_free(struct backing_file *ff)
>  {
> +       security_backing_file_free(&ff->file);
>         path_put(&ff->user_path);
>         kmem_cache_free(bfilp_cachep, ff);
>  }
> @@ -288,10 +304,12 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
>         return f;
>  }
>
> -static int init_backing_file(struct backing_file *ff)
> +static int init_backing_file(struct backing_file *ff,
> +                            const struct file *user_file)
>  {
>         memset(&ff->user_path, 0, sizeof(ff->user_path));
> -       return 0;
> +       backing_file_set_security(&ff->file, NULL);
> +       return security_backing_file_alloc(&ff->file, user_file);
>  }
>
>  /*
> @@ -301,7 +319,8 @@ static int init_backing_file(struct backing_file *ff)
>   * This is only for kernel internal use, and the allocate file must not be
>   * installed into file tables or such.
>   */
> -struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
> +struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
> +                                     const struct file *user_file)
>  {
>         struct backing_file *ff;
>         int error;
> @@ -318,7 +337,7 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
>
>         /* The f_mode flags must be set before fput(). */
>         ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
> -       error = init_backing_file(ff);
> +       error = init_backing_file(ff, user_file);
>         if (unlikely(error)) {
>                 fput(&ff->file);
>                 return ERR_PTR(error);
> diff --git a/fs/fuse/passthrough.c b/fs/fuse/passthrough.c
> index 72de97c03d0e..f2d08ac2459b 100644
> --- a/fs/fuse/passthrough.c
> +++ b/fs/fuse/passthrough.c
> @@ -167,7 +167,7 @@ struct fuse_backing *fuse_passthrough_open(struct file *file, int backing_id)
>                 goto out;
>
>         /* Allocate backing file per fuse file to store fuse path */
> -       backing_file = backing_file_open(&file->f_path, file->f_flags,
> +       backing_file = backing_file_open(file, file->f_flags,
>                                          &fb->file->f_path, fb->cred);
>         err = PTR_ERR(backing_file);
>         if (IS_ERR(backing_file)) {
> diff --git a/fs/internal.h b/fs/internal.h
> index cbc384a1aa09..77e90e4124e0 100644
> --- a/fs/internal.h
> +++ b/fs/internal.h
> @@ -106,7 +106,8 @@ extern void chroot_fs_refs(const struct path *, const struct path *);
>   */
>  struct file *alloc_empty_file(int flags, const struct cred *cred);
>  struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred);
> -struct file *alloc_empty_backing_file(int flags, const struct cred *cred);
> +struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
> +                                     const struct file *user_file);
>  void backing_file_set_user_path(struct file *f, const struct path *path);
>
>  static inline void file_put_write_access(struct file *file)
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index ff3dbd1ca61f..f2f20a611af3 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -1374,7 +1374,7 @@ static int ovl_create_tmpfile(struct file *file, struct dentry *dentry,
>                                 return PTR_ERR(cred);
>
>                         ovl_path_upper(dentry->d_parent, &realparentpath);
> -                       realfile = backing_tmpfile_open(&file->f_path, flags, &realparentpath,
> +                       realfile = backing_tmpfile_open(file, flags, &realparentpath,
>                                                         mode, current_cred());
>                         err = PTR_ERR_OR_ZERO(realfile);
>                         pr_debug("tmpfile/open(%pd2, 0%o) = %i\n", realparentpath.dentry, mode, err);
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index 97bed2286030..27cc07738f33 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -48,7 +48,7 @@ static struct file *ovl_open_realfile(const struct file *file,
>                         if (!inode_owner_or_capable(real_idmap, realinode))
>                                 flags &= ~O_NOATIME;
>
> -                       realfile = backing_file_open(file_user_path(file),
> +                       realfile = backing_file_open(file,
>                                                      flags, realpath, current_cred());
>                 }
>         }
> diff --git a/include/linux/backing-file.h b/include/linux/backing-file.h
> index 1476a6ed1bfd..c939cd222730 100644
> --- a/include/linux/backing-file.h
> +++ b/include/linux/backing-file.h
> @@ -18,10 +18,10 @@ struct backing_file_ctx {
>         void (*end_write)(struct kiocb *iocb, ssize_t);
>  };
>
> -struct file *backing_file_open(const struct path *user_path, int flags,
> +struct file *backing_file_open(const struct file *user_file, int flags,
>                                const struct path *real_path,
>                                const struct cred *cred);
> -struct file *backing_tmpfile_open(const struct path *user_path, int flags,
> +struct file *backing_tmpfile_open(const struct file *user_file, int flags,
>                                   const struct path *real_parentpath,
>                                   umode_t mode, const struct cred *cred);
>  ssize_t backing_file_read_iter(struct file *file, struct iov_iter *iter,
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 8b3dd145b25e..d0d0e8f55589 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2475,6 +2475,19 @@ struct file *dentry_create(struct path *path, int flags, umode_t mode,
>                            const struct cred *cred);
>  const struct path *backing_file_user_path(const struct file *f);
>
> +#ifdef CONFIG_SECURITY
> +void *backing_file_security(const struct file *f);
> +void backing_file_set_security(struct file *f, void *security);
> +#else
> +static inline void *backing_file_security(const struct file *f)
> +{
> +       return NULL;
> +}
> +static inline void backing_file_set_security(struct file *f, void *security)
> +{
> +}
> +#endif /* CONFIG_SECURITY */
> +
>  /*
>   * When mmapping a file on a stackable filesystem (e.g., overlayfs), the file
>   * stored in ->vm_file is a backing file whose f_inode is on the underlying
> diff --git a/include/linux/lsm_audit.h b/include/linux/lsm_audit.h
> index 382c56a97bba..584db296e43b 100644
> --- a/include/linux/lsm_audit.h
> +++ b/include/linux/lsm_audit.h
> @@ -94,7 +94,7 @@ struct common_audit_data {
>  #endif
>                 char *kmod_name;
>                 struct lsm_ioctlop_audit *op;
> -               struct file *file;
> +               const struct file *file;
>                 struct lsm_ibpkey_audit *ibpkey;
>                 struct lsm_ibendport_audit *ibendport;
>                 int reason;
> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> index 8c42b4bde09c..b4958167e381 100644
> --- a/include/linux/lsm_hook_defs.h
> +++ b/include/linux/lsm_hook_defs.h
> @@ -191,6 +191,9 @@ LSM_HOOK(int, 0, file_permission, struct file *file, int mask)
>  LSM_HOOK(int, 0, file_alloc_security, struct file *file)
>  LSM_HOOK(void, LSM_RET_VOID, file_release, struct file *file)
>  LSM_HOOK(void, LSM_RET_VOID, file_free_security, struct file *file)
> +LSM_HOOK(int, 0, backing_file_alloc, struct file *backing_file,
> +        const struct file *user_file)
> +LSM_HOOK(void, LSM_RET_VOID, backing_file_free, struct file *backing_file)
>  LSM_HOOK(int, 0, file_ioctl, struct file *file, unsigned int cmd,
>          unsigned long arg)
>  LSM_HOOK(int, 0, file_ioctl_compat, struct file *file, unsigned int cmd,
> @@ -198,6 +201,8 @@ LSM_HOOK(int, 0, file_ioctl_compat, struct file *file, unsigned int cmd,
>  LSM_HOOK(int, 0, mmap_addr, unsigned long addr)
>  LSM_HOOK(int, 0, mmap_file, struct file *file, unsigned long reqprot,
>          unsigned long prot, unsigned long flags)
> +LSM_HOOK(int, 0, mmap_backing_file, struct vm_area_struct *vma,
> +        struct file *backing_file, struct file *user_file)
>  LSM_HOOK(int, 0, file_mprotect, struct vm_area_struct *vma,
>          unsigned long reqprot, unsigned long prot)
>  LSM_HOOK(int, 0, file_lock, struct file *file, unsigned int cmd)
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index d48bf0ad26f4..b4f8cad53ddb 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -104,6 +104,7 @@ struct security_hook_list {
>  struct lsm_blob_sizes {
>         unsigned int lbs_cred;
>         unsigned int lbs_file;
> +       unsigned int lbs_backing_file;
>         unsigned int lbs_ib;
>         unsigned int lbs_inode;
>         unsigned int lbs_sock;
> diff --git a/include/linux/security.h b/include/linux/security.h
> index ee88dd2d2d1f..8d2d4856934e 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -472,11 +472,17 @@ int security_file_permission(struct file *file, int mask);
>  int security_file_alloc(struct file *file);
>  void security_file_release(struct file *file);
>  void security_file_free(struct file *file);
> +int security_backing_file_alloc(struct file *backing_file,
> +                               const struct file *user_file);
> +void security_backing_file_free(struct file *backing_file);
>  int security_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
>  int security_file_ioctl_compat(struct file *file, unsigned int cmd,
>                                unsigned long arg);
>  int security_mmap_file(struct file *file, unsigned long prot,
>                         unsigned long flags);
> +int security_mmap_backing_file(struct vm_area_struct *vma,
> +                              struct file *backing_file,
> +                              struct file *user_file);
>  int security_mmap_addr(unsigned long addr);
>  int security_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
>                            unsigned long prot);
> @@ -1141,6 +1147,15 @@ static inline void security_file_release(struct file *file)
>  static inline void security_file_free(struct file *file)
>  { }
>
> +static inline int security_backing_file_alloc(struct file *backing_file,
> +                                             const struct file *user_file)
> +{
> +       return 0;
> +}
> +
> +static inline void security_backing_file_free(struct file *backing_file)
> +{ }
> +
>  static inline int security_file_ioctl(struct file *file, unsigned int cmd,
>                                       unsigned long arg)
>  {
> @@ -1160,6 +1175,13 @@ static inline int security_mmap_file(struct file *file, unsigned long prot,
>         return 0;
>  }
>
> +static inline int security_mmap_backing_file(struct vm_area_struct *vma,
> +                                            struct file *backing_file,
> +                                            struct file *user_file)
> +{
> +       return 0;
> +}
> +
>  static inline int security_mmap_addr(unsigned long addr)
>  {
>         return cap_mmap_addr(addr);
> diff --git a/security/lsm.h b/security/lsm.h
> index db77cc83e158..32f808ad4335 100644
> --- a/security/lsm.h
> +++ b/security/lsm.h
> @@ -29,6 +29,7 @@ extern struct lsm_blob_sizes blob_sizes;
>
>  /* LSM blob caches */
>  extern struct kmem_cache *lsm_file_cache;
> +extern struct kmem_cache *lsm_backing_file_cache;
>  extern struct kmem_cache *lsm_inode_cache;
>
>  /* LSM blob allocators */
> diff --git a/security/lsm_init.c b/security/lsm_init.c
> index 573e2a7250c4..7c0fd17f1601 100644
> --- a/security/lsm_init.c
> +++ b/security/lsm_init.c
> @@ -293,6 +293,8 @@ static void __init lsm_prepare(struct lsm_info *lsm)
>         blobs = lsm->blobs;
>         lsm_blob_size_update(&blobs->lbs_cred, &blob_sizes.lbs_cred);
>         lsm_blob_size_update(&blobs->lbs_file, &blob_sizes.lbs_file);
> +       lsm_blob_size_update(&blobs->lbs_backing_file,
> +                            &blob_sizes.lbs_backing_file);
>         lsm_blob_size_update(&blobs->lbs_ib, &blob_sizes.lbs_ib);
>         /* inode blob gets an rcu_head in addition to LSM blobs. */
>         if (blobs->lbs_inode && blob_sizes.lbs_inode == 0)
> @@ -441,6 +443,8 @@ int __init security_init(void)
>         if (lsm_debug) {
>                 lsm_pr("blob(cred) size %d\n", blob_sizes.lbs_cred);
>                 lsm_pr("blob(file) size %d\n", blob_sizes.lbs_file);
> +               lsm_pr("blob(backing_file) size %d\n",
> +                      blob_sizes.lbs_backing_file);
>                 lsm_pr("blob(ib) size %d\n", blob_sizes.lbs_ib);
>                 lsm_pr("blob(inode) size %d\n", blob_sizes.lbs_inode);
>                 lsm_pr("blob(ipc) size %d\n", blob_sizes.lbs_ipc);
> @@ -462,6 +466,11 @@ int __init security_init(void)
>                 lsm_file_cache = kmem_cache_create("lsm_file_cache",
>                                                    blob_sizes.lbs_file, 0,
>                                                    SLAB_PANIC, NULL);
> +       if (blob_sizes.lbs_backing_file)
> +               lsm_backing_file_cache = kmem_cache_create(
> +                                                  "lsm_backing_file_cache",
> +                                                  blob_sizes.lbs_backing_file,
> +                                                  0, SLAB_PANIC, NULL);
>         if (blob_sizes.lbs_inode)
>                 lsm_inode_cache = kmem_cache_create("lsm_inode_cache",
>                                                     blob_sizes.lbs_inode, 0,
> diff --git a/security/security.c b/security/security.c
> index a26c1474e2e4..048560ef6a1a 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -82,6 +82,7 @@ const struct lsm_id *lsm_idlist[MAX_LSM_COUNT];
>  struct lsm_blob_sizes blob_sizes;
>
>  struct kmem_cache *lsm_file_cache;
> +struct kmem_cache *lsm_backing_file_cache;
>  struct kmem_cache *lsm_inode_cache;
>
>  #define SECURITY_HOOK_ACTIVE_KEY(HOOK, IDX) security_hook_active_##HOOK##_##IDX
> @@ -173,6 +174,30 @@ static int lsm_file_alloc(struct file *file)
>         return 0;
>  }
>
> +/**
> + * lsm_backing_file_alloc - allocate a composite backing file blob
> + * @backing_file: the backing file
> + *
> + * Allocate the backing file blob for all the modules.
> + *
> + * Returns 0, or -ENOMEM if memory can't be allocated.
> + */
> +static int lsm_backing_file_alloc(struct file *backing_file)
> +{
> +       void *blob;
> +
> +       if (!lsm_backing_file_cache) {
> +               backing_file_set_security(backing_file, NULL);
> +               return 0;
> +       }
> +
> +       blob = kmem_cache_zalloc(lsm_backing_file_cache, GFP_KERNEL);
> +       backing_file_set_security(backing_file, blob);
> +       if (!blob)
> +               return -ENOMEM;
> +       return 0;
> +}
> +
>  /**
>   * lsm_blob_alloc - allocate a composite blob
>   * @dest: the destination for the blob
> @@ -2418,6 +2443,57 @@ void security_file_free(struct file *file)
>         }
>  }
>
> +/**
> + * security_backing_file_alloc() - Allocate and setup a backing file blob
> + * @backing_file: the backing file
> + * @user_file: the associated user visible file
> + *
> + * Allocate a backing file LSM blob and perform any necessary initialization of
> + * the LSM blob.  There will be some operations where the LSM will not have
> + * access to @user_file after this point, so any important state associated
> + * with @user_file that is important to the LSM should be captured in the
> + * backing file's LSM blob.
> + *
> + * LSMs should avoid taking a reference to @user_file in this hook as it will
> + * result in problems later when the system attempts to drop/put the file
> + * references due to a circular dependency.
> + *
> + * Return: Return 0 if the hook is successful, negative values otherwise.
> + */
> +int security_backing_file_alloc(struct file *backing_file,
> +                               const struct file *user_file)
> +{
> +       int rc;
> +
> +       rc = lsm_backing_file_alloc(backing_file);
> +       if (rc)
> +               return rc;
> +       rc = call_int_hook(backing_file_alloc, backing_file, user_file);
> +       if (unlikely(rc))
> +               security_backing_file_free(backing_file);
> +
> +       return rc;
> +}
> +
> +/**
> + * security_backing_file_free() - Free a backing file blob
> + * @backing_file: the backing file
> + *
> + * Free any LSM state associated with a backing file's LSM blob, including the
> + * blob itself.
> + */
> +void security_backing_file_free(struct file *backing_file)
> +{
> +       void *blob = backing_file_security(backing_file);
> +
> +       call_void_hook(backing_file_free, backing_file);
> +
> +       if (blob) {
> +               backing_file_set_security(backing_file, NULL);
> +               kmem_cache_free(lsm_backing_file_cache, blob);
> +       }
> +}
> +
>  /**
>   * security_file_ioctl() - Check if an ioctl is allowed
>   * @file: associated file
> @@ -2506,6 +2582,32 @@ int security_mmap_file(struct file *file, unsigned long prot,
>                              flags);
>  }
>
> +/**
> + * security_mmap_backing_file - Check if mmap'ing a backing file is allowed
> + * @vma: the vm_area_struct for the mmap'd region
> + * @backing_file: the backing file being mmap'd
> + * @user_file: the user file being mmap'd
> + *
> + * Check permissions for a mmap operation on a stacked filesystem.  This hook
> + * is called after the security_mmap_file() and is responsible for authorizing
> + * the mmap on @backing_file.  It is important to note that the mmap operation
> + * on @user_file has already been authorized and the @vma->vm_file has been
> + * set to @backing_file.
> + *
> + * Return: Returns 0 if permission is granted.
> + */
> +int security_mmap_backing_file(struct vm_area_struct *vma,
> +                              struct file *backing_file,
> +                              struct file *user_file)
> +{
> +       /* recommended by the stackable filesystem devs */
> +       if (WARN_ON_ONCE(!(backing_file->f_mode & FMODE_BACKING)))
> +               return -EIO;
> +
> +       return call_int_hook(mmap_backing_file, vma, backing_file, user_file);
> +}
> +EXPORT_SYMBOL_GPL(security_mmap_backing_file);
> +
>  /**
>   * security_mmap_addr() - Check if mmap'ing an address is allowed
>   * @addr: address
> --
> 2.53.0
>

^ permalink raw reply

* [PATCH] apparmor: Fix two bugs of aa_setup_dfa_engine's fail handling
From: GONG Ruiqi @ 2026-04-03  3:51 UTC (permalink / raw)
  To: John Johansen, Paul Moore, James Morris, Serge E . Hallyn
  Cc: apparmor, linux-security-module, linux-kernel, lujialin4,
	gongruiqi1

First, aa_dfa_unpack() returns an ERR_PTR, not NULL, when it fails, but
aa_put_dfa() only checks its input against NULL, so passing it the error
pointer would cause an invalid memory access. Set nulldfa to NULL
explicitly to fix that.

Second, aa_put_pdb calls aa_pdb_free_kref -> aa_free_pdb -> aa_put_dfa,
i.e.  it will free nullpdb->dfa. But there's another aa_put_dfa(nulldfa)
after aa_put_pdb(nullpdb), which would cause double free. Remove that
redundant aa_put_dfa to fix that.
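
As a rough userspace illustration of the first bug only (this is not part
of the patch; every toy_* name is hypothetical and the ERR_PTR helpers are
simplified stand-ins for the kernel ones), a put helper that skips only
NULL must never be handed an error pointer:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define TOY_MAX_ERRNO 4095

static void *toy_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static long toy_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static int toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-TOY_MAX_ERRNO);
}

struct toy_dfa {
	int refcount;
};

/* Counts how many times the put helper dereferenced its argument. */
static int toy_put_derefs;

/* Like the put helper described above: NULL is the only value it skips. */
static void toy_put_dfa(struct toy_dfa *dfa)
{
	if (!dfa)
		return;
	/* An error pointer would be dereferenced as if it were an object. */
	assert(!toy_is_err(dfa));
	toy_put_derefs++;
	dfa->refcount--;
}

/* Hypothetical unpack step: reports failure as an error pointer. */
static struct toy_dfa *toy_unpack(int should_fail)
{
	static struct toy_dfa dfa;

	if (should_fail)
		return toy_err_ptr(-ENOMEM);
	dfa.refcount = 1;
	return &dfa;
}

static int toy_setup(int should_fail)
{
	struct toy_dfa *dfa = toy_unpack(should_fail);
	int error;

	if (toy_is_err(dfa)) {
		error = (int)toy_ptr_err(dfa);
		dfa = NULL;	/* never hand an error pointer to the put helper */
		goto out_err;
	}
	return 0;

out_err:
	toy_put_dfa(dfa);
	return error;
}
```

With the reset in place, toy_setup(1) returns -ENOMEM and the put helper
never touches memory; without it, the assert in toy_put_dfa() would fire.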

Fixes: 98b824ff8984 ("apparmor: refcount the pdb")
Signed-off-by: GONG Ruiqi <gongruiqi1@huawei.com>
---
 security/apparmor/lsm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index c1d42fc72fdb..be82ec1b9fd9 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -2465,6 +2465,7 @@ static int __init aa_setup_dfa_engine(void)
 			    TO_ACCEPT2_FLAG(YYTD_DATA32));
 	if (IS_ERR(nulldfa)) {
 		error = PTR_ERR(nulldfa);
+		nulldfa = NULL;
 		goto fail;
 	}
 	nullpdb->dfa = aa_get_dfa(nulldfa);
@@ -2486,7 +2487,6 @@ static int __init aa_setup_dfa_engine(void)
 
 fail:
 	aa_put_pdb(nullpdb);
-	aa_put_dfa(nulldfa);
 	nullpdb = NULL;
 	nulldfa = NULL;
 	stacksplitdfa = NULL;
-- 
2.43.0


^ permalink raw reply related

* [PATCH v4 3/3] selinux: fix overlayfs mmap() and mprotect() access checks
From: Paul Moore @ 2026-04-03  3:08 UTC (permalink / raw)
  To: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs
  Cc: Amir Goldstein, Gao Xiang, Christian Brauner
In-Reply-To: <20260403030848.731867-5-paul@paul-moore.com>

The existing SELinux security model for overlayfs is to allow access if
the current task is able to access the top level file (the "user" file)
and the mounter's credentials are sufficient to access the lower
level file (the "backing" file).  Unfortunately, the current code does
not properly enforce these access controls for both mmap() and mprotect()
operations on overlayfs filesystems.

This patch makes use of the newly created security_mmap_backing_file()
LSM hook to provide the missing backing file enforcement for mmap()
operations, and leverages the backing file API and new LSM blob to
provide the necessary information to properly enforce the mprotect()
access controls.
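
The two-sided model described above can be sketched in userspace roughly
as follows. This is only an illustration under stated assumptions: the
TOY_* permission bits and toy_* functions are hypothetical, and plain
bitmasks stand in for real SELinux policy decisions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/mman.h>

/* Hypothetical permission bits, standing in for FILE__READ and friends. */
#define TOY_READ	0x1u
#define TOY_WRITE	0x2u
#define TOY_EXECUTE	0x4u

/* Build the set of file permissions a mapping request needs. */
static unsigned int toy_map_av(unsigned long prot, bool shared)
{
	unsigned int av = TOY_READ;	/* read is always possible with a mapping */

	if (shared && (prot & PROT_WRITE))
		av |= TOY_WRITE;	/* write only matters if shared */
	if (prot & PROT_EXEC)
		av |= TOY_EXECUTE;
	return av;
}

/*
 * Both parties must hold every requested permission: the task for the
 * user file, and the mounter for the backing file.
 */
static int toy_overlay_map_check(unsigned int task_allowed,
				 unsigned int mounter_allowed,
				 unsigned long prot, bool shared)
{
	unsigned int av = toy_map_av(prot, shared);

	assert(av & TOY_READ);
	if (av & ~task_allowed)
		return -EACCES;		/* user file check failed */
	if (av & ~mounter_allowed)
		return -EACCES;		/* backing file check failed */
	return 0;
}
```

In this sketch a task that may read and execute the user file is still
refused a PROT_EXEC mapping when the mounter may only read the backing
file, which is the enforcement the paragraph above describes as missing.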

Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <paul@paul-moore.com>
---
 security/selinux/hooks.c          | 256 +++++++++++++++++++++---------
 security/selinux/include/objsec.h |  11 ++
 2 files changed, 196 insertions(+), 71 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index d8224ea113d1..76e0fb7dcb36 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -1745,6 +1745,60 @@ static inline int file_path_has_perm(const struct cred *cred,
 static int bpf_fd_pass(const struct file *file, u32 sid);
 #endif
 
+static int __file_has_perm(const struct cred *cred, const struct file *file,
+			   u32 av, bool bf_user_file)
+
+{
+	struct common_audit_data ad;
+	struct inode *inode;
+	u32 ssid = cred_sid(cred);
+	u32 tsid_fd;
+	int rc;
+
+	if (bf_user_file) {
+		struct backing_file_security_struct *bfsec;
+		const struct path *path;
+
+		if (WARN_ON(!(file->f_mode & FMODE_BACKING)))
+			return -EIO;
+
+		bfsec = selinux_backing_file(file);
+		path = backing_file_user_path(file);
+		tsid_fd = bfsec->uf_sid;
+		inode = d_inode(path->dentry);
+
+		ad.type = LSM_AUDIT_DATA_PATH;
+		ad.u.path = *path;
+	} else {
+		struct file_security_struct *fsec = selinux_file(file);
+
+		tsid_fd = fsec->sid;
+		inode = file_inode(file);
+
+		ad.type = LSM_AUDIT_DATA_FILE;
+		ad.u.file = file;
+	}
+
+	if (ssid != tsid_fd) {
+		rc = avc_has_perm(ssid, tsid_fd, SECCLASS_FD, FD__USE, &ad);
+		if (rc)
+			return rc;
+	}
+
+#ifdef CONFIG_BPF_SYSCALL
+	/* regardless of backing vs user file, use the underlying file here */
+	rc = bpf_fd_pass(file, ssid);
+	if (rc)
+		return rc;
+#endif
+
+	/* av is zero if only checking access to the descriptor. */
+	if (av)
+		return inode_has_perm(cred, inode, av, &ad);
+
+	return 0;
+}
+
 /* Check whether a task can use an open file descriptor to
    access an inode in a given way.  Check access to the
    descriptor itself, and then use dentry_has_perm to
@@ -1753,41 +1807,10 @@ static int bpf_fd_pass(const struct file *file, u32 sid);
    has the same SID as the process.  If av is zero, then
    access to the file is not checked, e.g. for cases
    where only the descriptor is affected like seek. */
-static int file_has_perm(const struct cred *cred,
-			 struct file *file,
-			 u32 av)
+static inline int file_has_perm(const struct cred *cred,
+				const struct file *file, u32 av)
 {
-	struct file_security_struct *fsec = selinux_file(file);
-	struct inode *inode = file_inode(file);
-	struct common_audit_data ad;
-	u32 sid = cred_sid(cred);
-	int rc;
-
-	ad.type = LSM_AUDIT_DATA_FILE;
-	ad.u.file = file;
-
-	if (sid != fsec->sid) {
-		rc = avc_has_perm(sid, fsec->sid,
-				  SECCLASS_FD,
-				  FD__USE,
-				  &ad);
-		if (rc)
-			goto out;
-	}
-
-#ifdef CONFIG_BPF_SYSCALL
-	rc = bpf_fd_pass(file, cred_sid(cred));
-	if (rc)
-		return rc;
-#endif
-
-	/* av is zero if only checking access to the descriptor. */
-	rc = 0;
-	if (av)
-		rc = inode_has_perm(cred, inode, av, &ad);
-
-out:
-	return rc;
+	return __file_has_perm(cred, file, av, false);
 }
 
 /*
@@ -3825,6 +3848,17 @@ static int selinux_file_alloc_security(struct file *file)
 	return 0;
 }
 
+static int selinux_backing_file_alloc(struct file *backing_file,
+				      const struct file *user_file)
+{
+	struct backing_file_security_struct *bfsec;
+
+	bfsec = selinux_backing_file(backing_file);
+	bfsec->uf_sid = selinux_file(user_file)->sid;
+
+	return 0;
+}
+
 /*
  * Check whether a task has the ioctl permission and cmd
  * operation to an inode.
@@ -3942,42 +3976,55 @@ static int selinux_file_ioctl_compat(struct file *file, unsigned int cmd,
 
 static int default_noexec __ro_after_init;
 
-static int file_map_prot_check(struct file *file, unsigned long prot, int shared)
+static int __file_map_prot_check(const struct cred *cred,
+				 const struct file *file, unsigned long prot,
+				 bool shared, bool bf_user_file)
 {
-	const struct cred *cred = current_cred();
-	u32 sid = cred_sid(cred);
-	int rc = 0;
+	struct inode *inode = NULL;
+	bool prot_exec = prot & PROT_EXEC;
+	bool prot_write = prot & PROT_WRITE;
+
+	if (file) {
+		if (bf_user_file)
+			inode = d_inode(backing_file_user_path(file)->dentry);
+		else
+			inode = file_inode(file);
+	}
+
+	if (default_noexec && prot_exec &&
+	    (!file || IS_PRIVATE(inode) || (!shared && prot_write))) {
+		int rc;
+		u32 sid = cred_sid(cred);
 
-	if (default_noexec &&
-	    (prot & PROT_EXEC) && (!file || IS_PRIVATE(file_inode(file)) ||
-				   (!shared && (prot & PROT_WRITE)))) {
 		/*
-		 * We are making executable an anonymous mapping or a
-		 * private file mapping that will also be writable.
-		 * This has an additional check.
+		 * We are making executable an anonymous mapping or a private
+		 * file mapping that will also be writable.
 		 */
-		rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
-				  PROCESS__EXECMEM, NULL);
+		rc = avc_has_perm(sid, sid, SECCLASS_PROCESS, PROCESS__EXECMEM,
+				  NULL);
 		if (rc)
-			goto error;
+			return rc;
 	}
 
 	if (file) {
-		/* read access is always possible with a mapping */
+		/* "read" always possible, "write" only if shared */
 		u32 av = FILE__READ;
-
-		/* write access only matters if the mapping is shared */
-		if (shared && (prot & PROT_WRITE))
+		if (shared && prot_write)
 			av |= FILE__WRITE;
-
-		if (prot & PROT_EXEC)
+		if (prot_exec)
 			av |= FILE__EXECUTE;
 
-		return file_has_perm(cred, file, av);
+		return __file_has_perm(cred, file, av, bf_user_file);
 	}
 
-error:
-	return rc;
+	return 0;
+}
+
+static inline int file_map_prot_check(const struct cred *cred,
+				      const struct file *file,
+				      unsigned long prot, bool shared)
+{
+	return __file_map_prot_check(cred, file, prot, shared, false);
 }
 
 static int selinux_mmap_addr(unsigned long addr)
@@ -3993,36 +4040,80 @@ static int selinux_mmap_addr(unsigned long addr)
 	return rc;
 }
 
-static int selinux_mmap_file(struct file *file,
-			     unsigned long reqprot __always_unused,
-			     unsigned long prot, unsigned long flags)
+static int selinux_mmap_file_common(const struct cred *cred, struct file *file,
+				    unsigned long prot, bool shared)
 {
-	struct common_audit_data ad;
-	int rc;
-
 	if (file) {
+		int rc;
+		struct common_audit_data ad;
+
 		ad.type = LSM_AUDIT_DATA_FILE;
 		ad.u.file = file;
-		rc = inode_has_perm(current_cred(), file_inode(file),
-				    FILE__MAP, &ad);
+		rc = inode_has_perm(cred, file_inode(file), FILE__MAP, &ad);
 		if (rc)
 			return rc;
 	}
 
-	return file_map_prot_check(file, prot,
-				   (flags & MAP_TYPE) == MAP_SHARED);
+	return file_map_prot_check(cred, file, prot, shared);
+}
+
+static int selinux_mmap_file(struct file *file,
+			     unsigned long reqprot __always_unused,
+			     unsigned long prot, unsigned long flags)
+{
+	return selinux_mmap_file_common(current_cred(), file, prot,
+					(flags & MAP_TYPE) == MAP_SHARED);
+}
+
+/**
+ * selinux_mmap_backing_file - Check mmap permissions on a backing file
+ * @vma: memory region
+ * @backing_file: stacked filesystem backing file
+ * @user_file: user visible file
+ *
+ * This is called after selinux_mmap_file() on stacked filesystems, and it
+ * is this function's responsibility to verify access to @backing_file and
+ * setup the SELinux state for possible later use in the mprotect() code path.
+ *
+ * By the time this function is called, mmap() access to @user_file has already
+ * been authorized and @vma->vm_file has been set to point to @backing_file.
+ *
+ * Return zero on success, negative values otherwise.
+ */
+static int selinux_mmap_backing_file(struct vm_area_struct *vma,
+				     struct file *backing_file,
+				     struct file *user_file __always_unused)
+{
+	unsigned long prot = 0;
+
+	/* translate vma->vm_flags perms into PROT perms */
+	if (vma->vm_flags & VM_READ)
+		prot |= PROT_READ;
+	if (vma->vm_flags & VM_WRITE)
+		prot |= PROT_WRITE;
+	if (vma->vm_flags & VM_EXEC)
+		prot |= PROT_EXEC;
+
+	return selinux_mmap_file_common(backing_file->f_cred, backing_file,
+					prot, vma->vm_flags & VM_SHARED);
 }
 
 static int selinux_file_mprotect(struct vm_area_struct *vma,
 				 unsigned long reqprot __always_unused,
 				 unsigned long prot)
 {
+	int rc;
 	const struct cred *cred = current_cred();
 	u32 sid = cred_sid(cred);
+	const struct file *file = vma->vm_file;
+	bool backing_file;
+	bool shared = vma->vm_flags & VM_SHARED;
+
+	/* check if we need to trigger the "backing files are awful" mode */
+	backing_file = file && (file->f_mode & FMODE_BACKING);
 
 	if (default_noexec &&
 	    (prot & PROT_EXEC) && !(vma->vm_flags & VM_EXEC)) {
-		int rc = 0;
 		/*
 		 * We don't use the vma_is_initial_heap() helper as it has
 		 * a history of problems and is currently broken on systems
@@ -4036,11 +4127,15 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
 		    vma->vm_end <= vma->vm_mm->brk) {
 			rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
 					  PROCESS__EXECHEAP, NULL);
-		} else if (!vma->vm_file && (vma_is_initial_stack(vma) ||
+			if (rc)
+				return rc;
+		} else if (!file && (vma_is_initial_stack(vma) ||
 			    vma_is_stack_for_current(vma))) {
 			rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
 					  PROCESS__EXECSTACK, NULL);
-		} else if (vma->vm_file && vma->anon_vma) {
+			if (rc)
+				return rc;
+		} else if (file && vma->anon_vma) {
 			/*
 			 * We are making executable a file mapping that has
 			 * had some COW done. Since pages might have been
@@ -4048,13 +4143,29 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
 			 * modified content.  This typically should only
 			 * occur for text relocations.
 			 */
-			rc = file_has_perm(cred, vma->vm_file, FILE__EXECMOD);
+			rc = __file_has_perm(cred, file, FILE__EXECMOD,
+					     backing_file);
+			if (rc)
+				return rc;
+			if (backing_file) {
+				rc = file_has_perm(file->f_cred, file,
+						   FILE__EXECMOD);
+				if (rc)
+					return rc;
+			}
 		}
+	}
+
+	rc = __file_map_prot_check(cred, file, prot, shared, backing_file);
+	if (rc)
+		return rc;
+	if (backing_file) {
+		rc = file_map_prot_check(file->f_cred, file, prot, shared);
 		if (rc)
 			return rc;
 	}
 
-	return file_map_prot_check(vma->vm_file, prot, vma->vm_flags&VM_SHARED);
+	return 0;
 }
 
 static int selinux_file_lock(struct file *file, unsigned int cmd)
@@ -7393,6 +7504,7 @@ struct lsm_blob_sizes selinux_blob_sizes __ro_after_init = {
 	.lbs_cred = sizeof(struct cred_security_struct),
 	.lbs_task = sizeof(struct task_security_struct),
 	.lbs_file = sizeof(struct file_security_struct),
+	.lbs_backing_file = sizeof(struct backing_file_security_struct),
 	.lbs_inode = sizeof(struct inode_security_struct),
 	.lbs_ipc = sizeof(struct ipc_security_struct),
 	.lbs_key = sizeof(struct key_security_struct),
@@ -7498,9 +7610,11 @@ static struct security_hook_list selinux_hooks[] __ro_after_init = {
 
 	LSM_HOOK_INIT(file_permission, selinux_file_permission),
 	LSM_HOOK_INIT(file_alloc_security, selinux_file_alloc_security),
+	LSM_HOOK_INIT(backing_file_alloc, selinux_backing_file_alloc),
 	LSM_HOOK_INIT(file_ioctl, selinux_file_ioctl),
 	LSM_HOOK_INIT(file_ioctl_compat, selinux_file_ioctl_compat),
 	LSM_HOOK_INIT(mmap_file, selinux_mmap_file),
+	LSM_HOOK_INIT(mmap_backing_file, selinux_mmap_backing_file),
 	LSM_HOOK_INIT(mmap_addr, selinux_mmap_addr),
 	LSM_HOOK_INIT(file_mprotect, selinux_file_mprotect),
 	LSM_HOOK_INIT(file_lock, selinux_file_lock),
diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
index 5bddd28ea5cb..b19e5d978e82 100644
--- a/security/selinux/include/objsec.h
+++ b/security/selinux/include/objsec.h
@@ -88,6 +88,10 @@ struct file_security_struct {
 	u32 pseqno; /* Policy seqno at the time of file open */
 };
 
+struct backing_file_security_struct {
+	u32 uf_sid; /* associated user file fsec->sid */
+};
+
 struct superblock_security_struct {
 	u32 sid; /* SID of file system superblock */
 	u32 def_sid; /* default SID for labeling */
@@ -195,6 +199,13 @@ static inline struct file_security_struct *selinux_file(const struct file *file)
 	return file->f_security + selinux_blob_sizes.lbs_file;
 }
 
+static inline struct backing_file_security_struct *
+selinux_backing_file(const struct file *backing_file)
+{
+	void *blob = backing_file_security(backing_file);
+	return blob + selinux_blob_sizes.lbs_backing_file;
+}
+
 static inline struct inode_security_struct *
 selinux_inode(const struct inode *inode)
 {
-- 
2.53.0


^ permalink raw reply related

* [PATCH v4 2/3] lsm: add backing_file LSM hooks
From: Paul Moore @ 2026-04-03  3:08 UTC (permalink / raw)
  To: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs
  Cc: Amir Goldstein, Gao Xiang, Christian Brauner
In-Reply-To: <20260403030848.731867-5-paul@paul-moore.com>

Stacked filesystems such as overlayfs do not currently provide the
necessary mechanisms for LSMs to properly enforce access controls on the
mmap() and mprotect() operations.  In order to resolve this gap, a LSM
security blob is being added to the backing_file struct and the following
new LSM hooks are being created:

 security_backing_file_alloc()
 security_backing_file_free()
 security_mmap_backing_file()

The first two hooks are to manage the lifecycle of the LSM security blob
in the backing_file struct, while the third provides a new mmap() access
control point for the underlying backing file.  It is also expected that
LSMs will likely want to update their security_file_mprotect() callback
to address issues with their mprotect() controls, but that does not
require a change to the security_file_mprotect() LSM hook.

There are three other small changes to support these new LSM hooks:
* Pass the user file associated with a backing file down to
alloc_empty_backing_file() so it can be included in the
security_backing_file_alloc() hook.
* Add getter and setter functions for the backing_file struct LSM blob
as the backing_file struct remains private to fs/file_table.c.
* Constify the file struct field in the LSM common_audit_data struct to
better support LSMs that need to pass a const file struct pointer into
the common LSM audit code.
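
The blob lifecycle and the getter/setter pair can be modeled in userspace
along these lines. This is a simplified sketch, not the kernel code: the
toy_* names are hypothetical, calloc()/free() stand in for the kmem_cache,
and the hook_rc parameter stands in for the result of the LSM hook call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the private backing_file struct; only accessors touch it. */
struct toy_backing_file {
	void *security;
};

static void *toy_get_security(const struct toy_backing_file *f)
{
	return f->security;
}

static void toy_set_security(struct toy_backing_file *f, void *security)
{
	f->security = security;
}

/* Blob size requested by the (hypothetical) LSMs; 0 means nobody wants one. */
static size_t toy_blob_size;

/* Release the blob, if any, and clear the pointer so a second call is safe. */
static void toy_backing_file_free(struct toy_backing_file *f)
{
	void *blob = toy_get_security(f);

	if (blob) {
		toy_set_security(f, NULL);
		free(blob);
	}
}

/*
 * Allocate a zeroed blob, then run the "hook"; hook_rc models the hook's
 * return value.  A failing hook must not leak the blob.
 */
static int toy_backing_file_alloc(struct toy_backing_file *f, int hook_rc)
{
	void *blob;

	assert(f != NULL);
	if (!toy_blob_size) {
		toy_set_security(f, NULL);
		return 0;
	}

	blob = calloc(1, toy_blob_size);
	toy_set_security(f, blob);
	if (!blob)
		return -ENOMEM;

	if (hook_rc)
		toy_backing_file_free(f);
	return hook_rc;
}
```

Keeping all access behind the two accessors is what lets the real struct
stay private to a single file while the security code still manages the
blob.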

Thanks to Arnd Bergmann for identifying the missing EXPORT_SYMBOL_GPL()
and supplying a fixup.

Cc: stable@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-unionfs@vger.kernel.org
Cc: linux-erofs@lists.ozlabs.org
Signed-off-by: Paul Moore <paul@paul-moore.com>
---
 fs/backing-file.c             |  18 ++++--
 fs/erofs/ishare.c             |  10 +++-
 fs/file_table.c               |  27 +++++++--
 fs/fuse/passthrough.c         |   2 +-
 fs/internal.h                 |   3 +-
 fs/overlayfs/dir.c            |   2 +-
 fs/overlayfs/file.c           |   2 +-
 include/linux/backing-file.h  |   4 +-
 include/linux/fs.h            |  13 +++++
 include/linux/lsm_audit.h     |   2 +-
 include/linux/lsm_hook_defs.h |   5 ++
 include/linux/lsm_hooks.h     |   1 +
 include/linux/security.h      |  22 ++++++++
 security/lsm.h                |   1 +
 security/lsm_init.c           |   9 +++
 security/security.c           | 102 ++++++++++++++++++++++++++++++++++
 16 files changed, 206 insertions(+), 17 deletions(-)

diff --git a/fs/backing-file.c b/fs/backing-file.c
index 45da8600d564..1f3bbfc75882 100644
--- a/fs/backing-file.c
+++ b/fs/backing-file.c
@@ -12,6 +12,7 @@
 #include <linux/backing-file.h>
 #include <linux/splice.h>
 #include <linux/mm.h>
+#include <linux/security.h>
 
 #include "internal.h"
 
@@ -29,14 +30,15 @@
  * returned file into a container structure that also stores the stacked
  * file's path, which can be retrieved using backing_file_user_path().
  */
-struct file *backing_file_open(const struct path *user_path, int flags,
+struct file *backing_file_open(const struct file *user_file, int flags,
 			       const struct path *real_path,
 			       const struct cred *cred)
 {
+	const struct path *user_path = &user_file->f_path;
 	struct file *f;
 	int error;
 
-	f = alloc_empty_backing_file(flags, cred);
+	f = alloc_empty_backing_file(flags, cred, user_file);
 	if (IS_ERR(f))
 		return f;
 
@@ -52,15 +54,16 @@ struct file *backing_file_open(const struct path *user_path, int flags,
 }
 EXPORT_SYMBOL_GPL(backing_file_open);
 
-struct file *backing_tmpfile_open(const struct path *user_path, int flags,
+struct file *backing_tmpfile_open(const struct file *user_file, int flags,
 				  const struct path *real_parentpath,
 				  umode_t mode, const struct cred *cred)
 {
 	struct mnt_idmap *real_idmap = mnt_idmap(real_parentpath->mnt);
+	const struct path *user_path = &user_file->f_path;
 	struct file *f;
 	int error;
 
-	f = alloc_empty_backing_file(flags, cred);
+	f = alloc_empty_backing_file(flags, cred, user_file);
 	if (IS_ERR(f))
 		return f;
 
@@ -336,8 +339,13 @@ int backing_file_mmap(struct file *file, struct vm_area_struct *vma,
 
 	vma_set_file(vma, file);
 
-	scoped_with_creds(ctx->cred)
+	scoped_with_creds(ctx->cred) {
+		ret = security_mmap_backing_file(vma, file, user_file);
+		if (ret)
+			return ret;
+
 		ret = vfs_mmap(vma->vm_file, vma);
+	}
 
 	if (ctx->accessed)
 		ctx->accessed(user_file);
diff --git a/fs/erofs/ishare.c b/fs/erofs/ishare.c
index ec433bacc592..6ed66b17359b 100644
--- a/fs/erofs/ishare.c
+++ b/fs/erofs/ishare.c
@@ -4,6 +4,7 @@
  */
 #include <linux/xxhash.h>
 #include <linux/mount.h>
+#include <linux/security.h>
 #include "internal.h"
 #include "xattr.h"
 
@@ -106,7 +107,8 @@ static int erofs_ishare_file_open(struct inode *inode, struct file *file)
 
 	if (file->f_flags & O_DIRECT)
 		return -EINVAL;
-	realfile = alloc_empty_backing_file(O_RDONLY|O_NOATIME, current_cred());
+	realfile = alloc_empty_backing_file(O_RDONLY|O_NOATIME, current_cred(),
+					    file);
 	if (IS_ERR(realfile))
 		return PTR_ERR(realfile);
 	ihold(sharedinode);
@@ -150,8 +152,14 @@ static ssize_t erofs_ishare_file_read_iter(struct kiocb *iocb,
 static int erofs_ishare_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct file *realfile = file->private_data;
+	int err;
 
 	vma_set_file(vma, realfile);
+
+	err = security_mmap_backing_file(vma, realfile, file);
+	if (err)
+		return err;
+
 	return generic_file_readonly_mmap(file, vma);
 }
 
diff --git a/fs/file_table.c b/fs/file_table.c
index 3b3792903185..d19d879b6efc 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -50,6 +50,9 @@ struct backing_file {
 		struct path user_path;
 		freeptr_t bf_freeptr;
 	};
+#ifdef CONFIG_SECURITY
+	void *security;
+#endif
 };
 
 #define backing_file(f) container_of(f, struct backing_file, file)
@@ -66,8 +69,21 @@ void backing_file_set_user_path(struct file *f, const struct path *path)
 }
 EXPORT_SYMBOL_GPL(backing_file_set_user_path);
 
+#ifdef CONFIG_SECURITY
+void *backing_file_security(const struct file *f)
+{
+	return backing_file(f)->security;
+}
+
+void backing_file_set_security(struct file *f, void *security)
+{
+	backing_file(f)->security = security;
+}
+#endif /* CONFIG_SECURITY */
+
 static inline void backing_file_free(struct backing_file *ff)
 {
+	security_backing_file_free(&ff->file);
 	path_put(&ff->user_path);
 	kmem_cache_free(bfilp_cachep, ff);
 }
@@ -288,10 +304,12 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
 	return f;
 }
 
-static int init_backing_file(struct backing_file *ff)
+static int init_backing_file(struct backing_file *ff,
+			     const struct file *user_file)
 {
 	memset(&ff->user_path, 0, sizeof(ff->user_path));
-	return 0;
+	backing_file_set_security(&ff->file, NULL);
+	return security_backing_file_alloc(&ff->file, user_file);
 }
 
 /*
@@ -301,7 +319,8 @@ static int init_backing_file(struct backing_file *ff)
  * This is only for kernel internal use, and the allocate file must not be
  * installed into file tables or such.
  */
-struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
+struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
+				      const struct file *user_file)
 {
 	struct backing_file *ff;
 	int error;
@@ -318,7 +337,7 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
 
 	/* The f_mode flags must be set before fput(). */
 	ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
-	error = init_backing_file(ff);
+	error = init_backing_file(ff, user_file);
 	if (unlikely(error)) {
 		fput(&ff->file);
 		return ERR_PTR(error);
diff --git a/fs/fuse/passthrough.c b/fs/fuse/passthrough.c
index 72de97c03d0e..f2d08ac2459b 100644
--- a/fs/fuse/passthrough.c
+++ b/fs/fuse/passthrough.c
@@ -167,7 +167,7 @@ struct fuse_backing *fuse_passthrough_open(struct file *file, int backing_id)
 		goto out;
 
 	/* Allocate backing file per fuse file to store fuse path */
-	backing_file = backing_file_open(&file->f_path, file->f_flags,
+	backing_file = backing_file_open(file, file->f_flags,
 					 &fb->file->f_path, fb->cred);
 	err = PTR_ERR(backing_file);
 	if (IS_ERR(backing_file)) {
diff --git a/fs/internal.h b/fs/internal.h
index cbc384a1aa09..77e90e4124e0 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -106,7 +106,8 @@ extern void chroot_fs_refs(const struct path *, const struct path *);
  */
 struct file *alloc_empty_file(int flags, const struct cred *cred);
 struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred);
-struct file *alloc_empty_backing_file(int flags, const struct cred *cred);
+struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
+				      const struct file *user_file);
 void backing_file_set_user_path(struct file *f, const struct path *path);
 
 static inline void file_put_write_access(struct file *file)
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index ff3dbd1ca61f..f2f20a611af3 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -1374,7 +1374,7 @@ static int ovl_create_tmpfile(struct file *file, struct dentry *dentry,
 				return PTR_ERR(cred);
 
 			ovl_path_upper(dentry->d_parent, &realparentpath);
-			realfile = backing_tmpfile_open(&file->f_path, flags, &realparentpath,
+			realfile = backing_tmpfile_open(file, flags, &realparentpath,
 							mode, current_cred());
 			err = PTR_ERR_OR_ZERO(realfile);
 			pr_debug("tmpfile/open(%pd2, 0%o) = %i\n", realparentpath.dentry, mode, err);
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 97bed2286030..27cc07738f33 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -48,7 +48,7 @@ static struct file *ovl_open_realfile(const struct file *file,
 			if (!inode_owner_or_capable(real_idmap, realinode))
 				flags &= ~O_NOATIME;
 
-			realfile = backing_file_open(file_user_path(file),
+			realfile = backing_file_open(file,
 						     flags, realpath, current_cred());
 		}
 	}
diff --git a/include/linux/backing-file.h b/include/linux/backing-file.h
index 1476a6ed1bfd..c939cd222730 100644
--- a/include/linux/backing-file.h
+++ b/include/linux/backing-file.h
@@ -18,10 +18,10 @@ struct backing_file_ctx {
 	void (*end_write)(struct kiocb *iocb, ssize_t);
 };
 
-struct file *backing_file_open(const struct path *user_path, int flags,
+struct file *backing_file_open(const struct file *user_file, int flags,
 			       const struct path *real_path,
 			       const struct cred *cred);
-struct file *backing_tmpfile_open(const struct path *user_path, int flags,
+struct file *backing_tmpfile_open(const struct file *user_file, int flags,
 				  const struct path *real_parentpath,
 				  umode_t mode, const struct cred *cred);
 ssize_t backing_file_read_iter(struct file *file, struct iov_iter *iter,
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8b3dd145b25e..d0d0e8f55589 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2475,6 +2475,19 @@ struct file *dentry_create(struct path *path, int flags, umode_t mode,
 			   const struct cred *cred);
 const struct path *backing_file_user_path(const struct file *f);
 
+#ifdef CONFIG_SECURITY
+void *backing_file_security(const struct file *f);
+void backing_file_set_security(struct file *f, void *security);
+#else
+static inline void *backing_file_security(const struct file *f)
+{
+	return NULL;
+}
+static inline void backing_file_set_security(struct file *f, void *security)
+{
+}
+#endif /* CONFIG_SECURITY */
+
 /*
  * When mmapping a file on a stackable filesystem (e.g., overlayfs), the file
  * stored in ->vm_file is a backing file whose f_inode is on the underlying
diff --git a/include/linux/lsm_audit.h b/include/linux/lsm_audit.h
index 382c56a97bba..584db296e43b 100644
--- a/include/linux/lsm_audit.h
+++ b/include/linux/lsm_audit.h
@@ -94,7 +94,7 @@ struct common_audit_data {
 #endif
 		char *kmod_name;
 		struct lsm_ioctlop_audit *op;
-		struct file *file;
+		const struct file *file;
 		struct lsm_ibpkey_audit *ibpkey;
 		struct lsm_ibendport_audit *ibendport;
 		int reason;
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 8c42b4bde09c..b4958167e381 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -191,6 +191,9 @@ LSM_HOOK(int, 0, file_permission, struct file *file, int mask)
 LSM_HOOK(int, 0, file_alloc_security, struct file *file)
 LSM_HOOK(void, LSM_RET_VOID, file_release, struct file *file)
 LSM_HOOK(void, LSM_RET_VOID, file_free_security, struct file *file)
+LSM_HOOK(int, 0, backing_file_alloc, struct file *backing_file,
+	 const struct file *user_file)
+LSM_HOOK(void, LSM_RET_VOID, backing_file_free, struct file *backing_file)
 LSM_HOOK(int, 0, file_ioctl, struct file *file, unsigned int cmd,
 	 unsigned long arg)
 LSM_HOOK(int, 0, file_ioctl_compat, struct file *file, unsigned int cmd,
@@ -198,6 +201,8 @@ LSM_HOOK(int, 0, file_ioctl_compat, struct file *file, unsigned int cmd,
 LSM_HOOK(int, 0, mmap_addr, unsigned long addr)
 LSM_HOOK(int, 0, mmap_file, struct file *file, unsigned long reqprot,
 	 unsigned long prot, unsigned long flags)
+LSM_HOOK(int, 0, mmap_backing_file, struct vm_area_struct *vma,
+	 struct file *backing_file, struct file *user_file)
 LSM_HOOK(int, 0, file_mprotect, struct vm_area_struct *vma,
 	 unsigned long reqprot, unsigned long prot)
 LSM_HOOK(int, 0, file_lock, struct file *file, unsigned int cmd)
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index d48bf0ad26f4..b4f8cad53ddb 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -104,6 +104,7 @@ struct security_hook_list {
 struct lsm_blob_sizes {
 	unsigned int lbs_cred;
 	unsigned int lbs_file;
+	unsigned int lbs_backing_file;
 	unsigned int lbs_ib;
 	unsigned int lbs_inode;
 	unsigned int lbs_sock;
diff --git a/include/linux/security.h b/include/linux/security.h
index ee88dd2d2d1f..8d2d4856934e 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -472,11 +472,17 @@ int security_file_permission(struct file *file, int mask);
 int security_file_alloc(struct file *file);
 void security_file_release(struct file *file);
 void security_file_free(struct file *file);
+int security_backing_file_alloc(struct file *backing_file,
+				const struct file *user_file);
+void security_backing_file_free(struct file *backing_file);
 int security_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
 int security_file_ioctl_compat(struct file *file, unsigned int cmd,
 			       unsigned long arg);
 int security_mmap_file(struct file *file, unsigned long prot,
 			unsigned long flags);
+int security_mmap_backing_file(struct vm_area_struct *vma,
+			       struct file *backing_file,
+			       struct file *user_file);
 int security_mmap_addr(unsigned long addr);
 int security_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
 			   unsigned long prot);
@@ -1141,6 +1147,15 @@ static inline void security_file_release(struct file *file)
 static inline void security_file_free(struct file *file)
 { }
 
+static inline int security_backing_file_alloc(struct file *backing_file,
+					      const struct file *user_file)
+{
+	return 0;
+}
+
+static inline void security_backing_file_free(struct file *backing_file)
+{ }
+
 static inline int security_file_ioctl(struct file *file, unsigned int cmd,
 				      unsigned long arg)
 {
@@ -1160,6 +1175,13 @@ static inline int security_mmap_file(struct file *file, unsigned long prot,
 	return 0;
 }
 
+static inline int security_mmap_backing_file(struct vm_area_struct *vma,
+					     struct file *backing_file,
+					     struct file *user_file)
+{
+	return 0;
+}
+
 static inline int security_mmap_addr(unsigned long addr)
 {
 	return cap_mmap_addr(addr);
diff --git a/security/lsm.h b/security/lsm.h
index db77cc83e158..32f808ad4335 100644
--- a/security/lsm.h
+++ b/security/lsm.h
@@ -29,6 +29,7 @@ extern struct lsm_blob_sizes blob_sizes;
 
 /* LSM blob caches */
 extern struct kmem_cache *lsm_file_cache;
+extern struct kmem_cache *lsm_backing_file_cache;
 extern struct kmem_cache *lsm_inode_cache;
 
 /* LSM blob allocators */
diff --git a/security/lsm_init.c b/security/lsm_init.c
index 573e2a7250c4..7c0fd17f1601 100644
--- a/security/lsm_init.c
+++ b/security/lsm_init.c
@@ -293,6 +293,8 @@ static void __init lsm_prepare(struct lsm_info *lsm)
 	blobs = lsm->blobs;
 	lsm_blob_size_update(&blobs->lbs_cred, &blob_sizes.lbs_cred);
 	lsm_blob_size_update(&blobs->lbs_file, &blob_sizes.lbs_file);
+	lsm_blob_size_update(&blobs->lbs_backing_file,
+			     &blob_sizes.lbs_backing_file);
 	lsm_blob_size_update(&blobs->lbs_ib, &blob_sizes.lbs_ib);
 	/* inode blob gets an rcu_head in addition to LSM blobs. */
 	if (blobs->lbs_inode && blob_sizes.lbs_inode == 0)
@@ -441,6 +443,8 @@ int __init security_init(void)
 	if (lsm_debug) {
 		lsm_pr("blob(cred) size %d\n", blob_sizes.lbs_cred);
 		lsm_pr("blob(file) size %d\n", blob_sizes.lbs_file);
+		lsm_pr("blob(backing_file) size %d\n",
+		       blob_sizes.lbs_backing_file);
 		lsm_pr("blob(ib) size %d\n", blob_sizes.lbs_ib);
 		lsm_pr("blob(inode) size %d\n", blob_sizes.lbs_inode);
 		lsm_pr("blob(ipc) size %d\n", blob_sizes.lbs_ipc);
@@ -462,6 +466,11 @@ int __init security_init(void)
 		lsm_file_cache = kmem_cache_create("lsm_file_cache",
 						   blob_sizes.lbs_file, 0,
 						   SLAB_PANIC, NULL);
+	if (blob_sizes.lbs_backing_file)
+		lsm_backing_file_cache = kmem_cache_create(
+						   "lsm_backing_file_cache",
+						   blob_sizes.lbs_backing_file,
+						   0, SLAB_PANIC, NULL);
 	if (blob_sizes.lbs_inode)
 		lsm_inode_cache = kmem_cache_create("lsm_inode_cache",
 						    blob_sizes.lbs_inode, 0,
diff --git a/security/security.c b/security/security.c
index a26c1474e2e4..048560ef6a1a 100644
--- a/security/security.c
+++ b/security/security.c
@@ -82,6 +82,7 @@ const struct lsm_id *lsm_idlist[MAX_LSM_COUNT];
 struct lsm_blob_sizes blob_sizes;
 
 struct kmem_cache *lsm_file_cache;
+struct kmem_cache *lsm_backing_file_cache;
 struct kmem_cache *lsm_inode_cache;
 
 #define SECURITY_HOOK_ACTIVE_KEY(HOOK, IDX) security_hook_active_##HOOK##_##IDX
@@ -173,6 +174,30 @@ static int lsm_file_alloc(struct file *file)
 	return 0;
 }
 
+/**
+ * lsm_backing_file_alloc - allocate a composite backing file blob
+ * @backing_file: the backing file
+ *
+ * Allocate the backing file blob for all the modules.
+ *
+ * Returns 0, or -ENOMEM if memory can't be allocated.
+ */
+static int lsm_backing_file_alloc(struct file *backing_file)
+{
+	void *blob;
+
+	if (!lsm_backing_file_cache) {
+		backing_file_set_security(backing_file, NULL);
+		return 0;
+	}
+
+	blob = kmem_cache_zalloc(lsm_backing_file_cache, GFP_KERNEL);
+	backing_file_set_security(backing_file, blob);
+	if (!blob)
+		return -ENOMEM;
+	return 0;
+}
+
 /**
  * lsm_blob_alloc - allocate a composite blob
  * @dest: the destination for the blob
@@ -2418,6 +2443,57 @@ void security_file_free(struct file *file)
 	}
 }
 
+/**
+ * security_backing_file_alloc() - Allocate and setup a backing file blob
+ * @backing_file: the backing file
+ * @user_file: the associated user visible file
+ *
+ * Allocate a backing file LSM blob and perform any necessary initialization of
+ * the LSM blob.  There will be some operations where the LSM will not have
+ * access to @user_file after this point, so any important state associated
+ * with @user_file that is important to the LSM should be captured in the
+ * backing file's LSM blob.
+ *
+ * LSMs should avoid taking a reference to @user_file in this hook as it will
+ * result in problems later when the system attempts to drop/put the file
+ * references due to a circular dependency.
+ *
+ * Return: Return 0 if the hook is successful, negative values otherwise.
+ */
+int security_backing_file_alloc(struct file *backing_file,
+				const struct file *user_file)
+{
+	int rc;
+
+	rc = lsm_backing_file_alloc(backing_file);
+	if (rc)
+		return rc;
+	rc = call_int_hook(backing_file_alloc, backing_file, user_file);
+	if (unlikely(rc))
+		security_backing_file_free(backing_file);
+
+	return rc;
+}
+
+/**
+ * security_backing_file_free() - Free a backing file blob
+ * @backing_file: the backing file
+ *
+ * Free any LSM state associated with a backing file's LSM blob, including the
+ * blob itself.
+ */
+void security_backing_file_free(struct file *backing_file)
+{
+	void *blob = backing_file_security(backing_file);
+
+	call_void_hook(backing_file_free, backing_file);
+
+	if (blob) {
+		backing_file_set_security(backing_file, NULL);
+		kmem_cache_free(lsm_backing_file_cache, blob);
+	}
+}
+
 /**
  * security_file_ioctl() - Check if an ioctl is allowed
  * @file: associated file
@@ -2506,6 +2582,32 @@ int security_mmap_file(struct file *file, unsigned long prot,
 			     flags);
 }
 
+/**
+ * security_mmap_backing_file - Check if mmap'ing a backing file is allowed
+ * @vma: the vm_area_struct for the mmap'd region
+ * @backing_file: the backing file being mmap'd
+ * @user_file: the user file being mmap'd
+ *
+ * Check permissions for a mmap operation on a stacked filesystem.  This hook
+ * is called after the security_mmap_file() and is responsible for authorizing
+ * the mmap on @backing_file.  It is important to note that the mmap operation
+ * on @user_file has already been authorized and the @vma->vm_file has been
+ * set to @backing_file.
+ *
+ * Return: Returns 0 if permission is granted.
+ */
+int security_mmap_backing_file(struct vm_area_struct *vma,
+			       struct file *backing_file,
+			       struct file *user_file)
+{
+	/* recommended by the stackable filesystem devs */
+	if (WARN_ON_ONCE(!(backing_file->f_mode & FMODE_BACKING)))
+		return -EIO;
+
+	return call_int_hook(mmap_backing_file, vma, backing_file, user_file);
+}
+EXPORT_SYMBOL_GPL(security_mmap_backing_file);
+
 /**
  * security_mmap_addr() - Check if mmap'ing an address is allowed
  * @addr: address
-- 
2.53.0
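
The allocate, initialize, and unwind-on-failure pattern used by
security_backing_file_alloc() in the diff above can be illustrated with a
small userspace sketch.  Everything below is a hypothetical stand-in and not
part of the patch: struct demo_file, demo_blob_alloc(), demo_blob_free(), the
two demo hooks, and calloc()/free() in place of the kmem cache.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a backing file carrying an LSM blob pointer. */
struct demo_file {
	void *security;
};

/* Hypothetical per-LSM init hook; a nonzero return simulates a failure. */
typedef int (*demo_alloc_hook)(struct demo_file *f);

int demo_hook_ok(struct demo_file *f)
{
	(void)f;
	return 0;
}

int demo_hook_fail(struct demo_file *f)
{
	(void)f;
	return -EACCES;
}

/* Release the blob and clear the pointer so a second call is harmless. */
void demo_blob_free(struct demo_file *f)
{
	free(f->security);	/* free(NULL) is a no-op */
	f->security = NULL;
}

/*
 * Allocate a zeroed blob (skipped when no LSM asked for space), then run
 * the init hook; if the hook fails, undo the allocation before returning.
 */
int demo_blob_alloc(struct demo_file *f, size_t size, demo_alloc_hook hook)
{
	int rc;

	f->security = NULL;
	if (size) {
		f->security = calloc(1, size);
		if (!f->security)
			return -ENOMEM;
	}

	rc = hook ? hook(f) : 0;
	if (rc)
		demo_blob_free(f);
	return rc;
}
```

The point of the sketch is only the ordering: the caller never sees a
half-initialized blob, because a hook failure releases the allocation before
the error is returned.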


* [PATCH v4 1/3] fs: prepare for adding LSM blob to backing_file
From: Paul Moore @ 2026-04-03  3:08 UTC (permalink / raw)
  To: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs
  Cc: Amir Goldstein, Gao Xiang, Christian Brauner
In-Reply-To: <20260403030848.731867-5-paul@paul-moore.com>

From: Amir Goldstein <amir73il@gmail.com>

In preparation for adding an LSM blob to the backing_file struct, factor out
the init_backing_file() and backing_file_free() helpers.

Cc: stable@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-unionfs@vger.kernel.org
Cc: linux-erofs@lists.ozlabs.org
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
[PM: use the term "LSM blob", fix comment style to match file]
Signed-off-by: Paul Moore <paul@paul-moore.com>
---
 fs/file_table.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index aaa5faaace1e..3b3792903185 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -66,6 +66,12 @@ void backing_file_set_user_path(struct file *f, const struct path *path)
 }
 EXPORT_SYMBOL_GPL(backing_file_set_user_path);
 
+static inline void backing_file_free(struct backing_file *ff)
+{
+	path_put(&ff->user_path);
+	kmem_cache_free(bfilp_cachep, ff);
+}
+
 static inline void file_free(struct file *f)
 {
 	security_file_free(f);
@@ -73,8 +79,7 @@ static inline void file_free(struct file *f)
 		percpu_counter_dec(&nr_files);
 	put_cred(f->f_cred);
 	if (unlikely(f->f_mode & FMODE_BACKING)) {
-		path_put(backing_file_user_path(f));
-		kmem_cache_free(bfilp_cachep, backing_file(f));
+		backing_file_free(backing_file(f));
 	} else {
 		kmem_cache_free(filp_cachep, f);
 	}
@@ -283,6 +288,12 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
 	return f;
 }
 
+static int init_backing_file(struct backing_file *ff)
+{
+	memset(&ff->user_path, 0, sizeof(ff->user_path));
+	return 0;
+}
+
 /*
  * Variant of alloc_empty_file() that allocates a backing_file container
  * and doesn't check and modify nr_files.
@@ -305,7 +316,14 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
 		return ERR_PTR(error);
 	}
 
+	/* The f_mode flags must be set before fput(). */
 	ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
+	error = init_backing_file(ff);
+	if (unlikely(error)) {
+		fput(&ff->file);
+		return ERR_PTR(error);
+	}
+
 	return &ff->file;
 }
 EXPORT_SYMBOL_GPL(alloc_empty_backing_file);
-- 
2.53.0


* [PATCH v4 0/3] Fix incorrect overlayfs mmap() and mprotect() LSM access controls
From: Paul Moore @ 2026-04-03  3:08 UTC (permalink / raw)
  To: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs
  Cc: Amir Goldstein, Gao Xiang, Christian Brauner

Another week, another revision to this patchset.  The v3 revision can be
found at the lore[1] link below.

The revision still takes the same basic approach introduced in v2, with
the most significant change in v4 being the change to make the backing
file LSM blob conditional on CONFIG_SECURITY.  This requires a number of
other changes to ensure that all accesses of the LSM blob go through a
set of accessor functions which can be converted into dummy functions
when !CONFIG_SECURITY.

While the changes between v3 and v4 were fairly straightforward, there
were enough of them that it felt wrong to preserve the ACKs from previous
revisions.  It would be appreciated if those of you who had previously
ACK'd a patch could take a second look and renew your ACK (or comment on
the problem preventing you from ACK'ing).

Thanks all.

[1] https://lore.kernel.org/linux-security-module/20260327220446.353103-4-paul@paul-moore.com/

--
CHANGELOG:
v4:
- added fs prep patch (Amir)
- added CONFIG_SECURITY conditional code (Amir)
v3:
- fix the LSM hook stubs (kernel robot, Ryan Lee)
- fix the lsm_backing_file_cache allocation size (Ryan Lee)
- minor style, simplicity tweaks to the SELinux patch
v2:
- remove the user O_PATH file patch from Amir
- add the backing_file LSM blob and lifecycle hooks
- update the SELinux code to reflect the other changes
v1:
- initial version

--
Amir Goldstein (1):
      fs: prepare for adding LSM blob to backing_file

Paul Moore (2):
      lsm: add backing_file LSM hooks
      selinux: fix overlayfs mmap() and mprotect() access checks

 fs/backing-file.c                 |   18 +-
 fs/erofs/ishare.c                 |   10 +
 fs/file_table.c                   |   43 ++++-
 fs/fuse/passthrough.c             |    2 
 fs/internal.h                     |    3 
 fs/overlayfs/dir.c                |    2 
 fs/overlayfs/file.c               |    2 
 include/linux/backing-file.h      |    4 
 include/linux/fs.h                |   13 +
 include/linux/lsm_audit.h         |    2 
 include/linux/lsm_hook_defs.h     |    5 
 include/linux/lsm_hooks.h         |    1 
 include/linux/security.h          |   22 ++
 security/lsm.h                    |    1 
 security/lsm_init.c               |    9 +
 security/security.c               |  102 +++++++++++
 security/selinux/hooks.c          |  256 +++++++++++++++++++++---------
 security/selinux/include/objsec.h |   11 +
 18 files changed, 419 insertions(+), 87 deletions(-)

* Re: LSM namespacing API
From: Paul Moore @ 2026-04-02 21:04 UTC (permalink / raw)
  To: Dr. Greg
  Cc: Stephen Smalley, Ondrej Mosnacek, linux-security-module, selinux,
	John Johansen
In-Reply-To: <ac5MKr4lFQhc44i6@wind.enjellic.com>

On Thu, Apr 2, 2026 at 7:00 AM Dr. Greg <greg@enjellic.com> wrote:
> On Sun, Mar 29, 2026 at 08:56:37PM -0400, Paul Moore wrote:
> > On Sun, Mar 29, 2026 at 12:09 PM Dr. Greg <greg@enjellic.com> wrote:
> > > On Tue, Mar 24, 2026 at 05:31:09PM -0400, Paul Moore wrote:
> > > > On Tue, Mar 3, 2026 at 11:46 AM Paul Moore <paul@paul-moore.com> wrote:

...

> Christian had proposed patches for a generic mechanism to create
> LSM security namespace blobs, is implementation of that in scope for
> this effort?

That isn't what Christian proposed, although I can understand how a
quick glance at the patchset would lead you to believe that (I had the
same misunderstanding while skimming my inbox on my phone while
traveling).  I suggest reviewing Christian's post again as well as the
related Landlock patchset which is the first to use the hooks
Christian proposed.

> > > It would seem that the flags variable might be a good option to use to
> > > handle this 2-stage transition, for example LSM_NS_INIT and
> > > LSM_NS_CHANGE, respectively, to specify the initialization and
> > > execution phases of the transition.
>
> > No.  The lsm_unshare() syscall is intended to mimic the existing
> > unshare() syscall as a single step process from a user's
> > perspective.  If it returns successfully the caller will be in a new
> > LSM namespace as defined by the individual LSM specified in the
> > syscall.
>
> OK, we can reason forward with that paradigm.
>
> An orchestrator issues the unshare call for an LSM namespace and upon
> return from the system call the calling task is in a new namespace for
> that particular LSM ...

Yes.

> ... the goal of which is presumably to implement a
> security policy/model different than what had been in force
> previously.

Maybe.  That is dependent on the individual LSM, I don't want to
encode any assumptions on this at the LSM framework layer.

> So the process is in a new LSM specific namespace, but still
> implementing the policy from the previous namespace, until the
> orchestrator can load the new policy and then trigger the LSM to
> change from its previous policy to the newly loaded policy.
>
> Is this consistent with your vision as to how all of this will work?

No.  What an individual LSM does upon creation of a new namespace via
lsm_unshare() is entirely up to that LSM.  The LSM may choose to bound
the new namespace by the parent's policy, or it may choose a
non-hierarchical relationship where the new namespace remains entirely
separate from the parent.  The LSM may start the new namespace in an
uninitialized state (similar to early boot), initialized with a
default policy, initialized with the parent's policy, or something
else.

> > > The other unanswered issue, or perhaps we missed it, are the security
> > > controls that should be associated with the unshare call.
>
> > Each LSM is free to implement whatever access controls it deems
> > necessary in its lsm_unshare() callback.
>
> Just to be clear.
>
> When you refer to 'lsm_unshare() callback' are you referring to a new
> LSM security hook to be implemented that will allow all of the
> active LSM's to pass judgement on whether or not the unshare should be
> allowed to complete successfully?

No.  The lsm_unshare() callback is the individual LSM provided
function that the LSM framework calls when the lsm_unshare() syscall
is invoked.  Put another way, the lsm_unshare() callback is the
function specified by a LSM, using the LSM_HOOK_INIT() macro, that is
called by the lsm_unshare() syscall.

> > > Will there be a new LSM hook that allows other LSM's to veto the
> > > creation of a namespace either for itself or for another LSM?
> >
> > I would expect the lsm_unshare() syscall to operate similarly to the
> > lsm_set_self_attr() syscall in this regard.
>
> The reference to handling this like lsm_set_self_attr() is unclear.
>
> With lsm_set_self_attr() there is no reason for another LSM to deny
> setting what is an LSM specific attribute, as you note above, each LSM
> gets to decide if the request to set an attribute for the LSM should
> be accepted or denied.

No.  LSM "A" gets to decide if LSM "A" can create a new namespace
using the lsm_unshare() syscall, LSM "B" does not get to enforce any
policy on LSM "A"'s decision.

> Since lsm_unshare() is changing the overall platform security state,
> it seems consistent with the design of the LSM for other LSM's to be
> able to veto this action.

No.  This is not consistent with either the design or general
conventions associated with LSM development.

> Once again, this seems like an action that would be consistent with
> the notion of the lockdown LSM,

No.

> > > Should there be an option to completely compile LSM namespaces out of
> > > the kernel?
>
> > That doesn't belong in the LSM framework layer, that is up to the
> > individual LSMs.
>
> You noted above the desire for lsm_unshare to be consistent with other
> namespaces.
>
> The current kernel paradigm is to allow classes of namespace
> resources, i.e. CONFIG_UTS_NS, CONFIG_TIME_NS et al., to be compiled in
> or out of the kernel.
>
> It seems that CONFIG_LSM_NS would be consistent with that model.

CONFIG_UTS_NS does not have multiple radically different
implementations underneath it.  Comparing any of the existing Kconfig
namespace knobs to what we are attempting to do with the LSM framework
is going to be difficult due to some inherent differences between the
two things.

The lsm_unshare() syscall is simply an API abstraction intended to
make it easier for userspace to interact with the individual LSMs;
instead of dealing with multiple different namespacing APIs, one for
each LSM, lsm_unshare() provides a single interface to make app devs'
lives easier.

If an individual LSM wants to provide a Kconfig knob to toggle its
namespace support it is welcome to do so; lsm_unshare() should
exist regardless and return an error code if the desired LSM does not
implement namespace support in the particular kernel build.
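
A rough userspace sketch of that behaviour, under stated assumptions: no
lsm_unshare() implementation has been posted in this thread, so the table
layout, the names (struct demo_lsm, demo_lsm_unshare()), the LSM IDs, and the
specific error codes below are all hypothetical and only illustrate "the
syscall always exists, the per-LSM callback may not".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-LSM descriptor; not a kernel structure. */
struct demo_lsm {
	unsigned int id;
	/* NULL when this LSM was built without namespace support. */
	int (*unshare)(unsigned int flags);
};

/* Hypothetical callback: pretend a new namespace was created. */
static int demo_ns_unshare_ok(unsigned int flags)
{
	(void)flags;
	return 0;
}

static const struct demo_lsm demo_lsms[] = {
	{ .id = 1, .unshare = demo_ns_unshare_ok },
	{ .id = 2, .unshare = NULL },
};

/*
 * Always-present entry point: find the requested LSM and hand the request
 * to its own callback.  Only the named LSM is consulted; the others get no
 * say in the decision.
 */
int demo_lsm_unshare(unsigned int lsm_id, unsigned int flags)
{
	size_t i;

	for (i = 0; i < sizeof(demo_lsms) / sizeof(demo_lsms[0]); i++) {
		if (demo_lsms[i].id != lsm_id)
			continue;
		if (!demo_lsms[i].unshare)
			return -EOPNOTSUPP;	/* assumed error code */
		return demo_lsms[i].unshare(flags);
	}
	return -ENOENT;				/* assumed error code */
}
```

With this shape, userspace can probe for namespace support by calling the
single entry point and checking the error, rather than learning a separate
interface for each LSM.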

> > > > * Implement /proc/pid/ns/lsm and setns(CLONE_NEWLSM)
> > > >
> > > > As discussed previously, this allows us to move a process into an
> > > > existing, established LSM namespace set.  The caller cannot
> > > > selectively choose which individual LSM namespaces they join from the
> > > > given LSM namespace set, they receive the same LSM namespace
> > > > configuration as the target process.
> > >
> > > As an initial aside.  It would be assumed that a positive result of a
> > > setns call would be to cause the calling process to atomically change
> > > its security namespace set.  This would further suggest the need to
> > > have the security namespace creation process also execute atomically
> > > in a multi-LSM namespace change environment.
>
> > In the setns case no new LSM namespaces should be created, the process
> > simply joins an existing set of LSM namespaces.
>
> The issue isn't about new namespaces being created, the issue is
> atomicity of a change to a new set of security policies.
>
> With setns an atomic transition is implemented.
>
> The proposed lsm_unshare() behavior results in a period of time when
> multiple and varying security policies are active, depending on
> various race issues in the orchestrator implementation.
>
> This opens the door to a raft of potential security issues that we can
> have a new acronym for, Time Of Implementation Time Of Use (TOITOU).

I would expect that any LSM implementing namespaces would have
sufficient protections/locking in place to ensure that processes and
namespaces remain in a consistent state outside of the
protected/locked regions.  It is reasonable for one process to attempt
the creation of a new namespace while another attempts to join the
namespace of the process creating the new namespace.  This is not
really a new problem in systems programming, and is one reason why
synchronization mechanisms exist.  Once again, we do not want to force
any particular solution at the LSM framework layer as the
synchronization mechanisms will likely be very LSM dependent.

> > > ... That is the concept of whether or not a setns
> > > call, for any resource namespace, should also force a security
> > > namespace change if the security namespace of the calling process
> > > differs from that of the target process.
>
> > That decision is left to the individual LSMs.
>
> That is reasonable.
>
> In order to support that model, there would seem to be a need to have
> a new LSM call in the setns code that allows LSM's to determine
> whether or not a change in the active security namespace set should be
> forced, correct?

Possibly.  I think we need to see some RFC code to see how this would
look, but I think the LSM implementation inside the setns() syscall
would need to be done in two stages: the first to "prepare" the join
operation where permissions checks are performed (if desired by the
individual LSM) and any operations that could fail are done; the
second stage would be very basic and simply finish the join operation
without any risk of failure.  An individual LSM could fail the join
operation for a variety of reasons in stage 1, causing the entire
setns() operation to fail, but once we progress to stage 2 the
operation should succeed.
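
To make the two stages concrete, here is a minimal userspace sketch of a
prepare/commit split.  It is not RFC code: struct demo_task, struct
demo_join_ops, demo_join(), and the two toy "LSMs" are hypothetical names
invented for illustration, and the -EPERM policy in the second LSM is an
arbitrary assumption.

```c
#include <errno.h>

#define DEMO_NR_LSMS 2

/* Hypothetical task state: one namespace ID per LSM. */
struct demo_task {
	int ns_id[DEMO_NR_LSMS];
};

/* Hypothetical per-LSM join operations. */
struct demo_join_ops {
	/* Stage 1: permission checks and anything else that can fail. */
	int (*prepare)(const struct demo_task *task, int target_ns);
	/* Stage 2: finish the join; must not fail. */
	void (*commit)(struct demo_task *task, int target_ns);
};

static int demo_a_prepare(const struct demo_task *task, int target_ns)
{
	(void)task;
	(void)target_ns;
	return 0;
}

static void demo_a_commit(struct demo_task *task, int target_ns)
{
	task->ns_id[0] = target_ns;
}

/* Toy policy (an assumption): refuse to join negative namespace IDs. */
static int demo_b_prepare(const struct demo_task *task, int target_ns)
{
	(void)task;
	return target_ns < 0 ? -EPERM : 0;
}

static void demo_b_commit(struct demo_task *task, int target_ns)
{
	task->ns_id[1] = target_ns;
}

static const struct demo_join_ops demo_ops[DEMO_NR_LSMS] = {
	{ demo_a_prepare, demo_a_commit },
	{ demo_b_prepare, demo_b_commit },
};

/*
 * Run every prepare step first; a single failure aborts the whole join
 * with the task untouched.  Only then run the commit steps.
 */
int demo_join(struct demo_task *task, const int target[DEMO_NR_LSMS])
{
	int i, rc;

	for (i = 0; i < DEMO_NR_LSMS; i++) {
		rc = demo_ops[i].prepare(task, target[i]);
		if (rc)
			return rc;
	}

	for (i = 0; i < DEMO_NR_LSMS; i++)
		demo_ops[i].commit(task, target[i]);

	return 0;
}
```

Because nothing is modified until every prepare step has succeeded, a
failed join leaves the task in its original set of namespaces.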

At this point I'm not too bothered by how we do this as it is an
implementation detail buried within the setns() implementation and not
really an API issue.  We could create a single LSM hook that is called
within sys_setns(), or we could leverage the existing two-stage
process within sys_setns() and implement the two LSM stages as two LSM
hooks.  The first option would be more complicated from a LSM
perspective, but cleaner from a nsproxy.c perspective (that alone
could make it the more preferable option).  The latter option would
result in cleaner, thinner LSM hooks, but it would likely add
complexity to ns_common and/or nsset.  As I said earlier, this is a
decision that will likely be decided by how the code ends up looking.

> If so, is implementation of this in scope for the lsm_unshare()
> infrastructure?

No.  The lsm_unshare() syscall would only operate on one LSM at a time
so a two stage process isn't needed at the LSM framework layer.  It is
possible that an individual LSM may want to implement a two-stage
transaction in their lsm_unshare() callback, but that is their
decision.

> To close, at the risk of being the devil's advocate.
>
> Given that the sentiment is to force almost all of these
> issues/decisions into the individual LSM's, what is the advantage of
> having a common lsm_unshare() system call?

A single uniform API for userspace applications that wish to make use
of LSM namespaces.  Ideally we want to leverage the existing kernel
APIs, e.g. procfs and setns(), but others, e.g. clone(), remain
impractical due to a combination of technical and political reasons
(we've already discussed some of the former, the latter is a rathole
discussion I'm not going to engage in at the moment).

> In the proposed model, a resource orchestrator is going to need to
> have extensive knowledge over the mechanics of all the LSM's that
> implement namespace functionality.

Maybe.  I don't think orchestrators will need to have "extensive"
knowledge of the individual LSMs, although this largely depends on
what you define as "extensive".

I also want to get ahead of this and say that I have absolutely zero
desire to debate this point with you at the moment.  It's an argument
without end and the discussion is unlikely to yield anything specific
enough to be helpful.

> At a very minimum, intrinsic to
> the concept of security namespaces, there will be a need to load a new
> policy or model into the namespace, an action that will be deeply LSM
> specific.

Possibly, as this is once again very LSM dependent.  Some LSMs may not
need a new policy loaded when they create a new namespace.

I will also, once again, point you at the LSM policy loading syscall
ideas.  While on hold, we've already discussed that they should be
namespace aware and potentially have the ability to trigger new LSM
namespace creation.

> At this point, the only common functionality may be the allocation of
> a new LSM namespace 'blob'.

Now you are starting to get it.  The LSM framework exists primarily as
a multiplexing layer hidden beneath an API.  Originally the API was
only for internal kernel users, but recently we started providing a
userspace syscall API.

-- 
paul-moore.com

* Re: [PATCH v3 0/5] Fix Landlock audit test flakiness
From: Günther Noack @ 2026-04-02 20:57 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Justin Suess,
	Tingmao Wang
In-Reply-To: <20260402.eb5c4e85f472@gnoack.org>

On Thu, Apr 02, 2026 at 10:52:46PM +0200, Günther Noack wrote:
> My kernel config is this:
> 
>     make defconfig
>     make kvm_guest.config
>     KCONFIG_CONFIG="${KBUILD_OUTPUT}/.config" ./scripts/kconfig/merge_config.sh "${KBUILD_OUTPUT}/.config" tools/testing/selftests/landlock/config
>     make debug.config
>     echo "CONFIG_RANDOMIZE_BASE=n" >> "${KBUILD_OUTPUT}/.config"
>     make olddefconfig

P.S.: I should point out that every time I have observed these
flakiness problems with the audit tests, it was with this debug config.
I suspect that it adds delays that make the flakiness more likely.

–Günther

* Re: [PATCH v3 0/5] Fix Landlock audit test flakiness
From: Günther Noack @ 2026-04-02 20:52 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Justin Suess,
	Tingmao Wang
In-Reply-To: <20260402192608.1458252-1-mic@digikod.net>

Hello!

On Thu, Apr 02, 2026 at 09:26:01PM +0200, Mickaël Salaün wrote:
> This series fixes two classes of audit selftest failures plus two minor
> bugs in the audit test helpers.
> 
> The main issue is that domain deallocation audit records are emitted
> asynchronously from kworker threads and can arrive after a previous
> test's socket has been closed.  This causes two distinct failure modes:
> 
> - audit_match_record() picks up a stale deallocation record from a
>   previous test instead of the expected one, causing a domain ID
>   mismatch.  The audit.layers test (which reads 16 deallocation records
>   in sequence) is particularly vulnerable because the large read window
>   allows stale records to interleave.  Patch 4 fixes this by filtering
>   deallocation records by domain ID and skipping type-matching records
>   with wrong content patterns.
> 
> - audit_count_records() counts stale deallocation records from a
>   previous test, incrementing records.domain from the expected 0 to 1.
>   Patch 3 fixes this by draining stale records at audit_init() time and
>   removing records.domain == 0 checks that are not preceded by
>   audit_match_record() calls (which would consume stale records).
> 
> These races are more likely to manifest when additional instrumentation
> changes kworker timing in the deallocation path (e.g. with the upcoming
> Landlock tracepoints work).
> 
> The two minor fixes (patches 1-2) correct a snprintf truncation check
> off-by-one and socket file descriptor leaks on error paths in
> audit_init(), audit_init_with_exe_filter(), and audit_cleanup().
> Patch 5 fixes a __u64 format warning reported by the kbuild bot on
> powerpc64.
> 
> Patch 1 is an exact subset of the v1 combined patch, which is why it
> carries the Reviewed-by tag.  Patches 2 and 3 extend beyond what was in
> v1, so the Reviewed-by is not carried.  Patches 4 and 5 are new.
> 
> Changes since v2:
> https://lore.kernel.org/r/20260401161503.1136946-1-mic@digikod.net
> - Patches 4-5: fix __u64 format warnings on powerpc64 (cast to unsigned
>   long long for %llx).  Patch 5 is new.
> 
> Changes since v1:
> https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
> - Split the combined drain fix into four separate patches.
> - Patch 2: extend fd leak fix to audit_init_with_exe_filter() and
>   audit_cleanup().
> - Patch 3: also remove domain checks from audit.trace and
>   scoped_audit.connect_to_child, document constraint, explain why a
>   longer drain timeout was rejected.
> - Patch 4: new, add domain ID filtering and timeout management to
>   matches_log_domain_deallocated(), skip stale records in
>   audit_match_record().
> 
> Mickaël Salaün (5):
>   selftests/landlock: Fix snprintf truncation checks in audit helpers
>   selftests/landlock: Fix socket file descriptor leaks in audit helpers
>   selftests/landlock: Drain stale audit records on init
>   selftests/landlock: Skip stale records in audit_match_record()
>   selftests/landlock: Fix format warning for __u64 in net_test
> 
>  tools/testing/selftests/landlock/audit.h      | 133 ++++++++++++++----
>  tools/testing/selftests/landlock/audit_test.c |  36 ++---
>  tools/testing/selftests/landlock/net_test.c   |   2 +-
>  .../testing/selftests/landlock/ptrace_test.c  |   1 -
>  .../landlock/scoped_abstract_unix_test.c      |   1 -
>  5 files changed, 119 insertions(+), 54 deletions(-)
> 
> -- 
> 2.53.0
> 

I am afraid I am still getting flaky audit tests even with these
patches.  Which tests flake varies from run to run, but some of them
still do, for example:

#  RUN           audit_layout1.remove_dir ...
# fs_test.c:7281:remove_dir:Expected 0 (0) == matches_log_fs(_metadata, self->audit_fd, "fs\\.remove_dir", dir_s1d2) (-11)
# remove_dir: Test failed
#          ❌ FAIL  audit_layout1.remove_dir
not ok 191 audit_layout1.remove_dir
#  RUN           audit_layout1.read_dir ...
#            ✅ OK  audit_layout1.read_dir
ok 192 audit_layout1.read_dir
#  RUN           audit_layout1.read_file ...
#            ✅ OK  audit_layout1.read_file
ok 193 audit_layout1.read_file
#  RUN           audit_layout1.write_file ...
# fs_test.c:7221:write_file:Expected 0 (0) == matches_log_fs(_metadata, self->audit_fd, "fs\\.write_file", file1_s1d1) (-11)
# fs_test.c:7224:write_file:Expected 0 (0) == records.access (1)
# write_file: Test failed
#          ❌ FAIL  audit_layout1.write_file
not ok 194 audit_layout1.write_file

My kernel config is this:

    make defconfig
    make kvm_guest.config
    KCONFIG_CONFIG="${KBUILD_OUTPUT}/.config" ./scripts/kconfig/merge_config.sh "${KBUILD_OUTPUT}/.config" tools/testing/selftests/landlock/config
    make debug.config
    echo "CONFIG_RANDOMIZE_BASE=n" >> "${KBUILD_OUTPUT}/.config"
    make olddefconfig

and then I run the selftests in Qemu with these flags:

qemu-system-x86_64 \
    -nographic \
    -m 4G \
    -enable-kvm \
    -append "console=ttyS0 lsm=landlock no_hash_pointers" \
    -kernel "${KBUILD_OUTPUT}/arch/x86/boot/bzImage" \
    -initrd "${INITRAMFS}"

This is using my own selftest runner scripts, which build an initramfs
with the statically linked selftests.

Do you have a hunch what might be missing there?  In the test run
above, I have applied your V4 patch set on top of the current master,
5619b098e2fbf3a23bf13d91897056a1fe238c6d ("Merge tag 'for-7.0-rc6-tag'
of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux").

–Günther


* Re: [PATCH v3 1/5] selftests/landlock: Fix snprintf truncation checks in audit helpers
From: Günther Noack @ 2026-04-02 20:30 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Justin Suess,
	Tingmao Wang, stable
In-Reply-To: <20260402192608.1458252-2-mic@digikod.net>

On Thu, Apr 02, 2026 at 09:26:02PM +0200, Mickaël Salaün wrote:
> snprintf() returns the number of characters that would have been
> written, excluding the terminating NUL byte.  When the output is
> truncated, this return value equals or exceeds the buffer size.  Fix
> matches_log_domain_allocated() and matches_log_domain_deallocated() to
> detect truncation with ">=" instead of ">".
> 
> Cc: Günther Noack <gnoack@google.com>
> Cc: stable@vger.kernel.org
> Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
> Reviewed-by: Günther Noack <gnoack@google.com>
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> ---
> 
> Changes since v1:
> https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
> - New patch (split from the drain fix).
> ---
>  tools/testing/selftests/landlock/audit.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
> index 44eb433e9666..1049a0582af5 100644
> --- a/tools/testing/selftests/landlock/audit.h
> +++ b/tools/testing/selftests/landlock/audit.h
> @@ -309,7 +309,7 @@ static int __maybe_unused matches_log_domain_allocated(int audit_fd, pid_t pid,
>  
>  	log_match_len =
>  		snprintf(log_match, sizeof(log_match), log_template, pid);
> -	if (log_match_len > sizeof(log_match))
> +	if (log_match_len >= sizeof(log_match))
>  		return -E2BIG;
>  
>  	return audit_match_record(audit_fd, AUDIT_LANDLOCK_DOMAIN, log_match,
> @@ -326,7 +326,7 @@ static int __maybe_unused matches_log_domain_deallocated(
>  
>  	log_match_len = snprintf(log_match, sizeof(log_match), log_template,
>  				 num_denials);
> -	if (log_match_len > sizeof(log_match))
> +	if (log_match_len >= sizeof(log_match))
>  		return -E2BIG;
>  
>  	return audit_match_record(audit_fd, AUDIT_LANDLOCK_DOMAIN, log_match,
> -- 
> 2.53.0
> 

Reviewed-by: Günther Noack <gnoack3000@gmail.com>

(I noticed the Reviewed-by tag was already there; I am re-sending it to
confirm that it also applies to this subset of the original patch.)
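
For readers following along, the boundary case the patch fixes can be
shown in isolation.  This is only a minimal sketch: format_pid() and its
"pid=%d" template are hypothetical stand-ins, not the helpers from
audit.h.  It relies on snprintf() returning the length it would have
written, excluding the NUL byte, so a return value equal to the buffer
size already means the output was cut short.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the patch): formats "pid=<pid>" into buf and
 * reports truncation with ">=" as the fixed audit helpers do.
 */
static int format_pid(char *buf, size_t size, pid_t pid)
{
	int len = snprintf(buf, size, "pid=%d", (int)pid);

	if (len < 0)
		return -EIO;
	/* len excludes the NUL byte, so len == size already means truncation. */
	if ((size_t)len >= size)
		return -E2BIG;
	return len;
}
```

With a 7-byte buffer and pid 123, the 7-character string "pid=123" no
longer fits next to its terminator; a ">" comparison would accept that
case, whereas ">=" rejects it.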

–Günther


* Re: [PATCH v3 3/5] selftests/landlock: Drain stale audit records on init
From: Günther Noack @ 2026-04-02 20:28 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Justin Suess,
	Tingmao Wang, stable
In-Reply-To: <20260402192608.1458252-4-mic@digikod.net>

On Thu, Apr 02, 2026 at 09:26:04PM +0200, Mickaël Salaün wrote:
> Non-audit Landlock tests generate audit records as side effects when
> audit_enabled is non-zero (e.g. from boot configuration).  These records
> accumulate in the kernel audit backlog while no audit daemon socket is
> open.  When the next test opens a new netlink socket and registers as
> the audit daemon, the stale backlog is delivered, causing baseline
> record count checks to fail spuriously.
> 
> Fix this by draining all pending records in audit_init() right after
> setting the receive timeout.  The 1-usec SO_RCVTIMEO causes audit_recv()
> to return -EAGAIN once the backlog is empty, naturally terminating the
> drain loop.
> 
> Domain deallocation records are emitted asynchronously from a work
> queue, so they may still arrive after the drain.  Remove records.domain
> == 0 checks that are not preceded by audit_match_record() calls, which
> would otherwise consume stale records before the count.  Document this
> constraint above audit_count_records().
> 
> Increasing the drain timeout to catch in-flight deallocation records was
> considered but rejected: a longer timeout adds latency to every
> audit_init() call even when no stale record is pending, and any fixed
> timeout is still not guaranteed to catch all records under load.
> Removing the unprotected checks is simpler and avoids the spurious
> failures.
> 
> Cc: Günther Noack <gnoack@google.com>
> Cc: stable@vger.kernel.org
> Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> ---
> 
> Changes since v1:
> https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
> - Also remove domain checks from audit.trace and
>   scoped_audit.connect_to_child.
> - Document records.domain == 0 constraint above
>   audit_count_records().
> - Explain why a longer drain timeout was rejected.
> - Drop Reviewed-by (new code comment not in v1).
> - Split snprintf and fd leak fixes into separate patches.
> ---
>  tools/testing/selftests/landlock/audit.h      | 19 +++++++++++++++++++
>  tools/testing/selftests/landlock/audit_test.c |  2 --
>  .../testing/selftests/landlock/ptrace_test.c  |  1 -
>  .../landlock/scoped_abstract_unix_test.c      |  1 -
>  4 files changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
> index 6422943fc69e..74e1c3d763be 100644
> --- a/tools/testing/selftests/landlock/audit.h
> +++ b/tools/testing/selftests/landlock/audit.h
> @@ -338,6 +338,15 @@ struct audit_records {
>  	size_t domain;
>  };
>  
> +/*
> + * WARNING: Do not assert records.domain == 0 without a preceding
> + * audit_match_record() call.  Domain deallocation records are emitted
> + * asynchronously from kworker threads and can arrive after the drain in
> + * audit_init(), corrupting the domain count.  A preceding audit_match_record()
> + * call consumes stale records while scanning, making the assertion safe in
> + * practice because stale deallocation records arrive before the expected access
> + * records.
> + */
>  static int audit_count_records(int audit_fd, struct audit_records *records)
>  {
>  	struct audit_message msg;
> @@ -393,6 +402,16 @@ static int audit_init(void)
>  		goto err_close;
>  	}
>  
> +	/*
> +	 * Drains stale audit records that accumulated in the kernel backlog
> +	 * while no audit daemon socket was open.  This happens when non-audit
> +	 * Landlock tests generate records while audit_enabled is non-zero (e.g.
> +	 * from boot configuration), or when domain deallocation records arrive
> +	 * asynchronously after a previous test's socket was closed.
> +	 */
> +	while (audit_recv(fd, NULL) == 0)
> +		;
> +
>  	return fd;
>  
>  err_close:
> diff --git a/tools/testing/selftests/landlock/audit_test.c b/tools/testing/selftests/landlock/audit_test.c
> index 46d02d49835a..f92ba6774faa 100644
> --- a/tools/testing/selftests/landlock/audit_test.c
> +++ b/tools/testing/selftests/landlock/audit_test.c
> @@ -412,7 +412,6 @@ TEST_F(audit_flags, signal)
>  		} else {
>  			EXPECT_EQ(1, records.access);
>  		}
> -		EXPECT_EQ(0, records.domain);
>  
>  		/* Updates filter rules to match the drop record. */
>  		set_cap(_metadata, CAP_AUDIT_CONTROL);
> @@ -601,7 +600,6 @@ TEST_F(audit_exec, signal_and_open)
>  	/* Tests that there was no denial until now. */
>  	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
>  	EXPECT_EQ(0, records.access);
> -	EXPECT_EQ(0, records.domain);
>  
>  	/*
>  	 * Wait for the child to do a first denied action by layer1 and
> diff --git a/tools/testing/selftests/landlock/ptrace_test.c b/tools/testing/selftests/landlock/ptrace_test.c
> index 4f64c90583cd..1b6c8b53bf33 100644
> --- a/tools/testing/selftests/landlock/ptrace_test.c
> +++ b/tools/testing/selftests/landlock/ptrace_test.c
> @@ -342,7 +342,6 @@ TEST_F(audit, trace)
>  	/* Makes sure there is no superfluous logged records. */
>  	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
>  	EXPECT_EQ(0, records.access);
> -	EXPECT_EQ(0, records.domain);
>  
>  	yama_ptrace_scope = get_yama_ptrace_scope();
>  	ASSERT_LE(0, yama_ptrace_scope);
> diff --git a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
> index 72f97648d4a7..c47491d2d1c1 100644
> --- a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
> +++ b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
> @@ -312,7 +312,6 @@ TEST_F(scoped_audit, connect_to_child)
>  	/* Makes sure there is no superfluous logged records. */
>  	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
>  	EXPECT_EQ(0, records.access);
> -	EXPECT_EQ(0, records.domain);
>  
>  	ASSERT_EQ(0, pipe2(pipe_child, O_CLOEXEC));
>  	ASSERT_EQ(0, pipe2(pipe_parent, O_CLOEXEC));
> -- 
> 2.53.0
> 

Reviewed-by: Günther Noack <gnoack3000@gmail.com>
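
As an aside, the drain idea can be tried outside the audit code.  The
sketch below is an illustration under assumptions: drain_socket() is a
hypothetical helper, it uses plain recv() on any datagram socket rather
than audit_recv() on a netlink socket, and it sets the 1-usec
SO_RCVTIMEO itself.  The loop stops when recv() times out, which is
reported as EAGAIN (or EWOULDBLOCK).

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: sets a 1-usec receive timeout on fd, then discards
 * queued datagrams until recv() times out.  Returns the number of datagrams
 * dropped, or a negative errno on an unexpected failure.
 */
static int drain_socket(int fd)
{
	const struct timeval tv = { .tv_sec = 0, .tv_usec = 1 };
	char buf[64];
	int dropped = 0;

	if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)))
		return -errno;

	for (;;) {
		ssize_t len = recv(fd, buf, sizeof(buf), 0);

		if (len >= 0) {
			dropped++;
			continue;
		}
		if (errno == EINTR)
			continue;
		/* The timeout expired: nothing is queued any more. */
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			return dropped;
		return -errno;
	}
}
```

As the commit message notes, such a drain only removes what is already
queued; anything emitted later still arrives afterwards.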


* Re: [PATCH v3 2/5] selftests/landlock: Fix socket file descriptor leaks in audit helpers
From: Günther Noack @ 2026-04-02 20:25 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Justin Suess,
	Tingmao Wang, stable
In-Reply-To: <20260402192608.1458252-3-mic@digikod.net>

On Thu, Apr 02, 2026 at 09:26:03PM +0200, Mickaël Salaün wrote:
> audit_init() opens a netlink socket and configures it, but leaks the
> file descriptor if audit_set_status() or setsockopt() fails.  Fix this
> by jumping to an error path that closes the socket before returning.
> 
> Apply the same fix to audit_init_with_exe_filter(), which leaks the file
> descriptor from audit_init() if audit_init_filter_exe() or
> audit_filter_exe() fails, and to audit_cleanup(), which leaks it if
> audit_init_filter_exe() fails in FIXTURE_TEARDOWN_PARENT().
> 
> Cc: Günther Noack <gnoack@google.com>
> Cc: stable@vger.kernel.org
> Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> ---
> 
> Changes since v1:
> https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
> - New patch (split from the drain fix, extended to
>   audit_init_with_exe_filter() and audit_cleanup()).
> ---
>  tools/testing/selftests/landlock/audit.h | 26 +++++++++++++++++-------
>  1 file changed, 19 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
> index 1049a0582af5..6422943fc69e 100644
> --- a/tools/testing/selftests/landlock/audit.h
> +++ b/tools/testing/selftests/landlock/audit.h
> @@ -379,19 +379,25 @@ static int audit_init(void)
>  
>  	err = audit_set_status(fd, AUDIT_STATUS_ENABLED, 1);
>  	if (err)
> -		return err;
> +		goto err_close;
>  
>  	err = audit_set_status(fd, AUDIT_STATUS_PID, getpid());
>  	if (err)
> -		return err;
> +		goto err_close;
>  
>  	/* Sets a timeout for negative tests. */
>  	err = setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &audit_tv_default,
>  			 sizeof(audit_tv_default));
> -	if (err)
> -		return -errno;
> +	if (err) {
> +		err = -errno;
> +		goto err_close;
> +	}
>  
>  	return fd;
> +
> +err_close:
> +	close(fd);
> +	return err;
>  }
>  
>  static int audit_init_filter_exe(struct audit_filter *filter, const char *path)
> @@ -441,8 +447,10 @@ static int audit_cleanup(int audit_fd, struct audit_filter *filter)
>  
>  		filter = &new_filter;
>  		err = audit_init_filter_exe(filter, NULL);
> -		if (err)
> +		if (err) {
> +			close(audit_fd);
>  			return err;
> +		}
>  	}
>  
>  	/* Filters might not be in place. */
> @@ -468,11 +476,15 @@ static int audit_init_with_exe_filter(struct audit_filter *filter)
>  
>  	err = audit_init_filter_exe(filter, NULL);
>  	if (err)
> -		return err;
> +		goto err_close;
>  
>  	err = audit_filter_exe(fd, filter, AUDIT_ADD_RULE);
>  	if (err)
> -		return err;
> +		goto err_close;
>  
>  	return fd;
> +
> +err_close:
> +	close(fd);
> +	return err;
>  }
> -- 
> 2.53.0
> 

Reviewed-by: Günther Noack <gnoack3000@gmail.com>
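
For reference, the single-exit cleanup pattern used by the patch looks
like this when reduced to a standalone function.  This is a sketch, not
the audit.h code: open_socket_with_timeout() is a hypothetical helper,
it opens an AF_UNIX datagram socket instead of a netlink one, and the
tv_len parameter exists only so that a caller can provoke a setsockopt()
failure by passing a too-short length.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: opens a datagram socket and sets a receive timeout.
 * Returns the file descriptor, or a negative errno without leaking the
 * socket.
 */
static int open_socket_with_timeout(const struct timeval *tv, socklen_t tv_len)
{
	int fd, err;

	fd = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (fd < 0)
		return -errno;

	if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, tv, tv_len)) {
		/* Saves errno before close() can overwrite it. */
		err = -errno;
		goto err_close;
	}

	return fd;

err_close:
	close(fd);
	return err;
}
```

Saving -errno before the jump matters because close() may itself set
errno, which is the same reason the patch assigns err before
"goto err_close".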


* Re: [PATCH v3 5/5] selftests/landlock: Fix format warning for __u64 in net_test
From: Günther Noack @ 2026-04-02 20:21 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Justin Suess,
	Tingmao Wang, stable, kernel test robot
In-Reply-To: <20260402192608.1458252-6-mic@digikod.net>

On Thu, Apr 02, 2026 at 09:26:06PM +0200, Mickaël Salaün wrote:
> On architectures where __u64 is unsigned long (e.g. powerpc64), using
> %llx to format a __u64 triggers a -Wformat warning because %llx expects
> unsigned long long.  Cast the argument to unsigned long long.
> 
> Cc: Günther Noack <gnoack@google.com>
> Cc: stable@vger.kernel.org
> Fixes: a549d055a22e ("selftests/landlock: Add network tests")
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/r/202604020206.62zgOTeP-lkp@intel.com/
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> ---
> 
> Changes since v2:
> - New patch.
> ---
>  tools/testing/selftests/landlock/net_test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/landlock/net_test.c b/tools/testing/selftests/landlock/net_test.c
> index b34b139b3f89..4c528154ea92 100644
> --- a/tools/testing/selftests/landlock/net_test.c
> +++ b/tools/testing/selftests/landlock/net_test.c
> @@ -1356,7 +1356,7 @@ TEST_F(mini, network_access_rights)
>  					    &net_port, 0))
>  		{
>  			TH_LOG("Failed to add rule with access 0x%llx: %s",
> -			       access, strerror(errno));
> +			       (unsigned long long)access, strerror(errno));
>  		}
>  	}
>  	EXPECT_EQ(0, close(ruleset_fd));
> -- 
> 2.53.0
> 

Reviewed-by: Günther Noack <gnoack3000@gmail.com>
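
The cast is easy to check on its own.  In this sketch, uint64_t stands
in for __u64 and format_access() is a hypothetical helper, not code from
net_test.c.  Depending on the ABI, a 64-bit type may be defined as
unsigned long or unsigned long long, so casting the argument makes it
match %llx in both cases.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats a 64-bit access mask as hexadecimal.  The cast
 * makes the argument type match %llx whatever the underlying type of
 * uint64_t is on the target.
 */
static int format_access(char *buf, size_t size, uint64_t access)
{
	return snprintf(buf, size, "0x%llx", (unsigned long long)access);
}
```

The conversion is value-preserving, since unsigned long long is at least
64 bits wide.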


* [PATCH v3 2/5] selftests/landlock: Fix socket file descriptor leaks in audit helpers
From: Mickaël Salaün @ 2026-04-02 19:26 UTC (permalink / raw)
  To: Günther Noack
  Cc: Mickaël Salaün, linux-security-module, Justin Suess,
	Tingmao Wang, stable
In-Reply-To: <20260402192608.1458252-1-mic@digikod.net>

audit_init() opens a netlink socket and configures it, but leaks the
file descriptor if audit_set_status() or setsockopt() fails.  Fix this
by jumping to an error path that closes the socket before returning.

Apply the same fix to audit_init_with_exe_filter(), which leaks the file
descriptor from audit_init() if audit_init_filter_exe() or
audit_filter_exe() fails, and to audit_cleanup(), which leaks it if
audit_init_filter_exe() fails in FIXTURE_TEARDOWN_PARENT().

Cc: Günther Noack <gnoack@google.com>
Cc: stable@vger.kernel.org
Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
---

Changes since v1:
https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
- New patch (split from the drain fix, extended to
  audit_init_with_exe_filter() and audit_cleanup()).
---
 tools/testing/selftests/landlock/audit.h | 26 +++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
index 1049a0582af5..6422943fc69e 100644
--- a/tools/testing/selftests/landlock/audit.h
+++ b/tools/testing/selftests/landlock/audit.h
@@ -379,19 +379,25 @@ static int audit_init(void)
 
 	err = audit_set_status(fd, AUDIT_STATUS_ENABLED, 1);
 	if (err)
-		return err;
+		goto err_close;
 
 	err = audit_set_status(fd, AUDIT_STATUS_PID, getpid());
 	if (err)
-		return err;
+		goto err_close;
 
 	/* Sets a timeout for negative tests. */
 	err = setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &audit_tv_default,
 			 sizeof(audit_tv_default));
-	if (err)
-		return -errno;
+	if (err) {
+		err = -errno;
+		goto err_close;
+	}
 
 	return fd;
+
+err_close:
+	close(fd);
+	return err;
 }
 
 static int audit_init_filter_exe(struct audit_filter *filter, const char *path)
@@ -441,8 +447,10 @@ static int audit_cleanup(int audit_fd, struct audit_filter *filter)
 
 		filter = &new_filter;
 		err = audit_init_filter_exe(filter, NULL);
-		if (err)
+		if (err) {
+			close(audit_fd);
 			return err;
+		}
 	}
 
 	/* Filters might not be in place. */
@@ -468,11 +476,15 @@ static int audit_init_with_exe_filter(struct audit_filter *filter)
 
 	err = audit_init_filter_exe(filter, NULL);
 	if (err)
-		return err;
+		goto err_close;
 
 	err = audit_filter_exe(fd, filter, AUDIT_ADD_RULE);
 	if (err)
-		return err;
+		goto err_close;
 
 	return fd;
+
+err_close:
+	close(fd);
+	return err;
 }
-- 
2.53.0



* Re: LSM namespacing API
From: Paul Moore @ 2026-04-02 19:31 UTC (permalink / raw)
  To: Dr. Greg
  Cc: Casey Schaufler, Stephen Smalley, Ondrej Mosnacek,
	linux-security-module, selinux, John Johansen
In-Reply-To: <5e210223-f9a4-4613-8c4b-bea5eea7f8c0@schaufler-ca.com>

On Thu, Apr 2, 2026 at 1:49 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 4/2/2026 3:59 AM, Dr. Greg wrote:
> > That still leaves the question of whether or not CAP_MAC_ADMIN is
> > appropriate for gating the creation of a new security namespace.
>
> That will have to be up to the individual LSMs.

Yes, exactly.

> Not all LSMs implement Mandatory Access Controls.

... and not all LSMs that implement mandatory access controls rely on
CAP_MAC_ADMIN to gate configuration changes.

-- 
paul-moore.com


* [PATCH v3 3/5] selftests/landlock: Drain stale audit records on init
From: Mickaël Salaün @ 2026-04-02 19:26 UTC (permalink / raw)
  To: Günther Noack
  Cc: Mickaël Salaün, linux-security-module, Justin Suess,
	Tingmao Wang, stable
In-Reply-To: <20260402192608.1458252-1-mic@digikod.net>

Non-audit Landlock tests generate audit records as side effects when
audit_enabled is non-zero (e.g. from boot configuration).  These records
accumulate in the kernel audit backlog while no audit daemon socket is
open.  When the next test opens a new netlink socket and registers as
the audit daemon, the stale backlog is delivered, causing baseline
record count checks to fail spuriously.

Fix this by draining all pending records in audit_init() right after
setting the receive timeout.  The 1-usec SO_RCVTIMEO causes audit_recv()
to return -EAGAIN once the backlog is empty, naturally terminating the
drain loop.

Domain deallocation records are emitted asynchronously from a work
queue, so they may still arrive after the drain.  Remove records.domain
== 0 checks that are not preceded by audit_match_record() calls, which
would otherwise consume stale records before the count.  Document this
constraint above audit_count_records().

Increasing the drain timeout to catch in-flight deallocation records was
considered but rejected: a longer timeout adds latency to every
audit_init() call even when no stale record is pending, and any fixed
timeout is still not guaranteed to catch all records under load.
Removing the unprotected checks is simpler and avoids the spurious
failures.

Cc: Günther Noack <gnoack@google.com>
Cc: stable@vger.kernel.org
Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
---

Changes since v1:
https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
- Also remove domain checks from audit.trace and
  scoped_audit.connect_to_child.
- Document records.domain == 0 constraint above
  audit_count_records().
- Explain why a longer drain timeout was rejected.
- Drop Reviewed-by (new code comment not in v1).
- Split snprintf and fd leak fixes into separate patches.
---
 tools/testing/selftests/landlock/audit.h      | 19 +++++++++++++++++++
 tools/testing/selftests/landlock/audit_test.c |  2 --
 .../testing/selftests/landlock/ptrace_test.c  |  1 -
 .../landlock/scoped_abstract_unix_test.c      |  1 -
 4 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
index 6422943fc69e..74e1c3d763be 100644
--- a/tools/testing/selftests/landlock/audit.h
+++ b/tools/testing/selftests/landlock/audit.h
@@ -338,6 +338,15 @@ struct audit_records {
 	size_t domain;
 };
 
+/*
+ * WARNING: Do not assert records.domain == 0 without a preceding
+ * audit_match_record() call.  Domain deallocation records are emitted
+ * asynchronously from kworker threads and can arrive after the drain in
+ * audit_init(), corrupting the domain count.  A preceding audit_match_record()
+ * call consumes stale records while scanning, making the assertion safe in
+ * practice because stale deallocation records arrive before the expected access
+ * records.
+ */
 static int audit_count_records(int audit_fd, struct audit_records *records)
 {
 	struct audit_message msg;
@@ -393,6 +402,16 @@ static int audit_init(void)
 		goto err_close;
 	}
 
+	/*
+	 * Drains stale audit records that accumulated in the kernel backlog
+	 * while no audit daemon socket was open.  This happens when non-audit
+	 * Landlock tests generate records while audit_enabled is non-zero (e.g.
+	 * from boot configuration), or when domain deallocation records arrive
+	 * asynchronously after a previous test's socket was closed.
+	 */
+	while (audit_recv(fd, NULL) == 0)
+		;
+
 	return fd;
 
 err_close:
diff --git a/tools/testing/selftests/landlock/audit_test.c b/tools/testing/selftests/landlock/audit_test.c
index 46d02d49835a..f92ba6774faa 100644
--- a/tools/testing/selftests/landlock/audit_test.c
+++ b/tools/testing/selftests/landlock/audit_test.c
@@ -412,7 +412,6 @@ TEST_F(audit_flags, signal)
 		} else {
 			EXPECT_EQ(1, records.access);
 		}
-		EXPECT_EQ(0, records.domain);
 
 		/* Updates filter rules to match the drop record. */
 		set_cap(_metadata, CAP_AUDIT_CONTROL);
@@ -601,7 +600,6 @@ TEST_F(audit_exec, signal_and_open)
 	/* Tests that there was no denial until now. */
 	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
 	EXPECT_EQ(0, records.access);
-	EXPECT_EQ(0, records.domain);
 
 	/*
 	 * Wait for the child to do a first denied action by layer1 and
diff --git a/tools/testing/selftests/landlock/ptrace_test.c b/tools/testing/selftests/landlock/ptrace_test.c
index 4f64c90583cd..1b6c8b53bf33 100644
--- a/tools/testing/selftests/landlock/ptrace_test.c
+++ b/tools/testing/selftests/landlock/ptrace_test.c
@@ -342,7 +342,6 @@ TEST_F(audit, trace)
 	/* Makes sure there is no superfluous logged records. */
 	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
 	EXPECT_EQ(0, records.access);
-	EXPECT_EQ(0, records.domain);
 
 	yama_ptrace_scope = get_yama_ptrace_scope();
 	ASSERT_LE(0, yama_ptrace_scope);
diff --git a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
index 72f97648d4a7..c47491d2d1c1 100644
--- a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
+++ b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
@@ -312,7 +312,6 @@ TEST_F(scoped_audit, connect_to_child)
 	/* Makes sure there is no superfluous logged records. */
 	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
 	EXPECT_EQ(0, records.access);
-	EXPECT_EQ(0, records.domain);
 
 	ASSERT_EQ(0, pipe2(pipe_child, O_CLOEXEC));
 	ASSERT_EQ(0, pipe2(pipe_parent, O_CLOEXEC));
-- 
2.53.0



* [PATCH v3 0/5] Fix Landlock audit test flakiness
From: Mickaël Salaün @ 2026-04-02 19:26 UTC (permalink / raw)
  To: Günther Noack
  Cc: Mickaël Salaün, linux-security-module, Justin Suess,
	Tingmao Wang

This series fixes two classes of audit selftest failures plus two minor
bugs in the audit test helpers.

The main issue is that domain deallocation audit records are emitted
asynchronously from kworker threads and can arrive after a previous
test's socket has been closed.  This causes two distinct failure modes:

- audit_match_record() picks up a stale deallocation record from a
  previous test instead of the expected one, causing a domain ID
  mismatch.  The audit.layers test (which reads 16 deallocation records
  in sequence) is particularly vulnerable because the large read window
  allows stale records to interleave.  Patch 4 fixes this by filtering
  deallocation records by domain ID and skipping type-matching records
  with wrong content patterns.

- audit_count_records() counts stale deallocation records from a
  previous test, incrementing records.domain from the expected 0 to 1.
  Patch 3 fixes this by draining stale records at audit_init() time and
  removing records.domain == 0 checks that are not preceded by
  audit_match_record() calls (which would consume stale records).

These races are more likely to manifest when additional instrumentation
changes kworker timing in the deallocation path (e.g. with the upcoming
Landlock tracepoints work).

The two minor fixes (patches 1-2) correct a snprintf truncation check
off-by-one and socket file descriptor leaks on error paths in
audit_init(), audit_init_with_exe_filter(), and audit_cleanup().
Patch 5 fixes a __u64 format warning reported by the kbuild bot on
powerpc64.

Patch 1 is an exact subset of the v1 combined patch, which is why it
carries the Reviewed-by tag.  Patches 2 and 3 extend beyond what was in
v1, so the Reviewed-by is not carried.  Patches 4 and 5 are new.

Changes since v2:
https://lore.kernel.org/r/20260401161503.1136946-1-mic@digikod.net
- Patches 4-5: fix __u64 format warnings on powerpc64 (cast to unsigned
  long long for %llx).  Patch 5 is new.

Changes since v1:
https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
- Split the combined drain fix into four separate patches.
- Patch 2: extend fd leak fix to audit_init_with_exe_filter() and
  audit_cleanup().
- Patch 3: also remove domain checks from audit.trace and
  scoped_audit.connect_to_child, document constraint, explain why a
  longer drain timeout was rejected.
- Patch 4: new, add domain ID filtering and timeout management to
  matches_log_domain_deallocated(), skip stale records in
  audit_match_record().

Mickaël Salaün (5):
  selftests/landlock: Fix snprintf truncation checks in audit helpers
  selftests/landlock: Fix socket file descriptor leaks in audit helpers
  selftests/landlock: Drain stale audit records on init
  selftests/landlock: Skip stale records in audit_match_record()
  selftests/landlock: Fix format warning for __u64 in net_test

 tools/testing/selftests/landlock/audit.h      | 133 ++++++++++++++----
 tools/testing/selftests/landlock/audit_test.c |  36 ++---
 tools/testing/selftests/landlock/net_test.c   |   2 +-
 .../testing/selftests/landlock/ptrace_test.c  |   1 -
 .../landlock/scoped_abstract_unix_test.c      |   1 -
 5 files changed, 119 insertions(+), 54 deletions(-)

-- 
2.53.0


