Linux Security Modules development


* Re: [RFC PATCH v1 01/11] security: add LSM blob and hooks for namespaces
From: Christian Brauner @ 2026-04-10  9:35 UTC
  To: Mickaël Salaün
  Cc: Günther Noack, Paul Moore, Serge E . Hallyn, Justin Suess,
	Lennart Poettering, Mikhail Ivanov, Nicolas Bouchinet,
	Shervin Oloumi, Tingmao Wang, kernel-team, linux-fsdevel,
	linux-kernel, linux-security-module, Daniel Durning
In-Reply-To: <20260409.Mei6Yei0beeZ@digikod.net>

On Thu, Apr 09, 2026 at 06:40:03PM +0200, Mickaël Salaün wrote:
> On Wed, Mar 25, 2026 at 01:31:30PM +0100, Christian Brauner wrote:
> > On Thu, Mar 12, 2026 at 11:04:34AM +0100, Mickaël Salaün wrote:
> > > From: Christian Brauner <brauner@kernel.org>
> > > 
> > > All namespace types now share the same ns_common infrastructure. Extend
> > > this to include a security blob so LSMs can start managing namespaces
> > > uniformly without having to add one-off hooks or security fields to
> > > every individual namespace type.
> > > 
> > > Add a ns_security pointer to ns_common and the corresponding lbs_ns
> > > blob size to lsm_blob_sizes. Allocation and freeing hooks are called
> > > from the common __ns_common_init() and __ns_common_free() paths so
> > > every namespace type gets covered in one go. All information about the
> > > namespace type and the appropriate casting helpers to get at the
> > > containing namespace are available via ns_common making it
> > > straightforward for LSMs to differentiate when they need to.
> > > 
> > > A namespace_install hook is called from validate_ns() during setns(2)
> > > giving LSMs a chance to enforce policy on namespace transitions.
> > > 
> > > Individual namespace types can still have their own specialized security
> > > hooks when needed. This is just the common baseline that makes it easy
> > > to track and manage namespaces from the security side without requiring
> > > every namespace type to reinvent the wheel.
> > > 
> > > Cc: Günther Noack <gnoack@google.com>
> > > Cc: Paul Moore <paul@paul-moore.com>
> > > Cc: Serge E. Hallyn <serge@hallyn.com>
> > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > Link: https://lore.kernel.org/r/20260216-work-security-namespace-v1-1-075c28758e1f@kernel.org
> > > ---
> > >  include/linux/lsm_hook_defs.h      |  3 ++
> > >  include/linux/lsm_hooks.h          |  1 +
> > >  include/linux/ns/ns_common_types.h |  3 ++
> > >  include/linux/security.h           | 20 ++++++++
> > >  kernel/nscommon.c                  | 12 +++++
> > >  kernel/nsproxy.c                   |  8 +++-
> > >  security/lsm_init.c                |  2 +
> > >  security/security.c                | 76 ++++++++++++++++++++++++++++++
> > >  8 files changed, 124 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> > > index 8c42b4bde09c..fefd3aa6d8f4 100644
> > > --- a/include/linux/lsm_hook_defs.h
> > > +++ b/include/linux/lsm_hook_defs.h
> > > @@ -260,6 +260,9 @@ LSM_HOOK(int, -ENOSYS, task_prctl, int option, unsigned long arg2,
> > >  LSM_HOOK(void, LSM_RET_VOID, task_to_inode, struct task_struct *p,
> > >  	 struct inode *inode)
> > >  LSM_HOOK(int, 0, userns_create, const struct cred *cred)
> > > +LSM_HOOK(int, 0, namespace_alloc, struct ns_common *ns)
> > > +LSM_HOOK(void, LSM_RET_VOID, namespace_free, struct ns_common *ns)
> > > +LSM_HOOK(int, 0, namespace_install, const struct nsset *nsset, struct ns_common *ns)
> > >  LSM_HOOK(int, 0, ipc_permission, struct kern_ipc_perm *ipcp, short flag)
> > >  LSM_HOOK(void, LSM_RET_VOID, ipc_getlsmprop, struct kern_ipc_perm *ipcp,
> > >  	 struct lsm_prop *prop)
> > > diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> > > index d48bf0ad26f4..3e7afe76e86c 100644
> > > --- a/include/linux/lsm_hooks.h
> > > +++ b/include/linux/lsm_hooks.h
> > > @@ -111,6 +111,7 @@ struct lsm_blob_sizes {
> > >  	unsigned int lbs_ipc;
> > >  	unsigned int lbs_key;
> > >  	unsigned int lbs_msg_msg;
> > > +	unsigned int lbs_ns;
> > >  	unsigned int lbs_perf_event;
> > >  	unsigned int lbs_task;
> > >  	unsigned int lbs_xattr_count; /* num xattr slots in new_xattrs array */
> > > diff --git a/include/linux/ns/ns_common_types.h b/include/linux/ns/ns_common_types.h
> > > index 0014fbc1c626..170288e2e895 100644
> > > --- a/include/linux/ns/ns_common_types.h
> > > +++ b/include/linux/ns/ns_common_types.h
> > > @@ -115,6 +115,9 @@ struct ns_common {
> > >  	struct dentry *stashed;
> > >  	const struct proc_ns_operations *ops;
> > >  	unsigned int inum;
> > > +#ifdef CONFIG_SECURITY
> > > +	void *ns_security;
> > > +#endif
> > >  	union {
> > >  		struct ns_tree;
> > >  		struct rcu_head ns_rcu;
> > > diff --git a/include/linux/security.h b/include/linux/security.h
> > > index 83a646d72f6f..611b9098367d 100644
> > > --- a/include/linux/security.h
> > > +++ b/include/linux/security.h
> > > @@ -67,6 +67,7 @@ enum fs_value_type;
> > >  struct watch;
> > >  struct watch_notification;
> > >  struct lsm_ctx;
> > > +struct nsset;
> > >  
> > >  /* Default (no) options for the capable function */
> > >  #define CAP_OPT_NONE 0x0
> > > @@ -80,6 +81,7 @@ struct lsm_ctx;
> > >  
> > >  struct ctl_table;
> > >  struct audit_krule;
> > > +struct ns_common;
> > >  struct user_namespace;
> > >  struct timezone;
> > >  
> > > @@ -533,6 +535,9 @@ int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
> > >  			unsigned long arg4, unsigned long arg5);
> > >  void security_task_to_inode(struct task_struct *p, struct inode *inode);
> > >  int security_create_user_ns(const struct cred *cred);
> > > +int security_namespace_alloc(struct ns_common *ns);
> > > +void security_namespace_free(struct ns_common *ns);
> > > +int security_namespace_install(const struct nsset *nsset, struct ns_common *ns);
> > >  int security_ipc_permission(struct kern_ipc_perm *ipcp, short flag);
> > >  void security_ipc_getlsmprop(struct kern_ipc_perm *ipcp, struct lsm_prop *prop);
> > >  int security_msg_msg_alloc(struct msg_msg *msg);
> > > @@ -1407,6 +1412,21 @@ static inline int security_create_user_ns(const struct cred *cred)
> > >  	return 0;
> > >  }
> > >  
> > > +static inline int security_namespace_alloc(struct ns_common *ns)
> > > +{
> > > +	return 0;
> > > +}
> > > +
> > > +static inline void security_namespace_free(struct ns_common *ns)
> > > +{
> > > +}
> > > +
> > > +static inline int security_namespace_install(const struct nsset *nsset,
> > > +					     struct ns_common *ns)
> > > +{
> > > +	return 0;
> > > +}
> > > +
> > >  static inline int security_ipc_permission(struct kern_ipc_perm *ipcp,
> > >  					  short flag)
> > >  {
> > > diff --git a/kernel/nscommon.c b/kernel/nscommon.c
> > > index bdc3c86231d3..de774e374f9d 100644
> > > --- a/kernel/nscommon.c
> > > +++ b/kernel/nscommon.c
> > > @@ -4,6 +4,7 @@
> > >  #include <linux/ns_common.h>
> > >  #include <linux/nstree.h>
> > >  #include <linux/proc_ns.h>
> > > +#include <linux/security.h>
> > >  #include <linux/user_namespace.h>
> > >  #include <linux/vfsdebug.h>
> > >  
> > > @@ -59,6 +60,9 @@ int __ns_common_init(struct ns_common *ns, u32 ns_type, const struct proc_ns_ope
> > >  
> > >  	refcount_set(&ns->__ns_ref, 1);
> > >  	ns->stashed = NULL;
> > > +#ifdef CONFIG_SECURITY
> > > +	ns->ns_security = NULL;
> > > +#endif
> > >  	ns->ops = ops;
> > >  	ns->ns_id = 0;
> > >  	ns->ns_type = ns_type;
> > > @@ -77,6 +81,13 @@ int __ns_common_init(struct ns_common *ns, u32 ns_type, const struct proc_ns_ope
> > >  		ret = proc_alloc_inum(&ns->inum);
> > >  	if (ret)
> > >  		return ret;
> > > +
> > > +	ret = security_namespace_alloc(ns);
> > > +	if (ret) {
> > > +		proc_free_inum(ns->inum);
> > 
> > ret = security_namespace_alloc(ns);
> > if (ret && !inum)
> >         proc_free_inum(ns->inum);
> > return ret;
> > 
> > 
> > > +		return ret;
> > > +	}
> > > +
> > >  	/*
> > >  	 * Tree ref starts at 0. It's incremented when namespace enters
> > >  	 * active use (installed in nsproxy) and decremented when all
> > > @@ -91,6 +102,7 @@ int __ns_common_init(struct ns_common *ns, u32 ns_type, const struct proc_ns_ope
> > >  
> > >  void __ns_common_free(struct ns_common *ns)
> > >  {
> > > +	security_namespace_free(ns);
> > >  	proc_free_inum(ns->inum);
> > >  }
> > >  
> > > diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> > > index 259c4b4f1eeb..f0b30d1907e7 100644
> > > --- a/kernel/nsproxy.c
> > > +++ b/kernel/nsproxy.c
> > > @@ -379,7 +379,13 @@ static int prepare_nsset(unsigned flags, struct nsset *nsset)
> > >  
> > >  static inline int validate_ns(struct nsset *nsset, struct ns_common *ns)
> > >  {
> > > -	return ns->ops->install(nsset, ns);
> > > +	int ret;
> > > +
> > > +	ret = ns->ops->install(nsset, ns);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	return security_namespace_install(nsset, ns);
> > 
> > In my local tree I had that moved before the ->install() and I think
> > that's the correct thing to do. So please switch to that.
> 
> Looks good, I'll include your fixes in the next version.

Thanks!
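
For illustration, the ordering agreed on above (LSM hook first, then
->install()) can be modelled in user space roughly as follows.  This is
only a sketch: struct ns_model, lsm_install_hook() and type_install() are
hypothetical stand-ins for struct ns_common, security_namespace_install()
and ns->ops->install(), not the kernel code itself.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct ns_common; not the kernel type. */
struct ns_model {
	int lsm_denies;    /* non-zero: the modelled LSM rejects the transition */
	int install_calls; /* how often the type-specific install callback ran */
};

/* Models security_namespace_install(): 0 on success, -EPERM on denial. */
static int lsm_install_hook(struct ns_model *ns)
{
	return ns->lsm_denies ? -EPERM : 0;
}

/* Models ns->ops->install(); counts calls so the ordering is observable. */
static int type_install(struct ns_model *ns)
{
	ns->install_calls++;
	return 0;
}

/* LSM check first: a denial returns before any type-specific work runs. */
int validate_ns_model(struct ns_model *ns)
{
	int ret;

	ret = lsm_install_hook(ns);
	if (ret)
		return ret;

	return type_install(ns);
}
```

In this model a denied transition never reaches the type-specific
callback, which is the visible difference from calling the hook last.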

> 
> > 
> > The rest looks good to me, thanks.
> 
> Another issue raised by Daniel Durning [1] is freeing of anonymous
> namespaces.
> 
> I'll extend this patch with this new hunk if that's ok:
> 
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 854f4fc66469..f6977e59be7d 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -4186,6 +4186,8 @@ static void free_mnt_ns(struct mnt_namespace *ns)
>  {
>         if (!is_anon_ns(ns))
>                 ns_common_free(ns);
> +       else
> +               security_namespace_free(&ns->ns);
>         dec_mnt_namespaces(ns->ucounts);
>         mnt_ns_tree_remove(ns);
>  }

I think that's fixing it at the wrong layer. It's probably better to do
something like:

diff --git a/include/uapi/linux/nsfs.h b/include/uapi/linux/nsfs.h
index a25e38d1c874..ea0f0267d90f 100644
--- a/include/uapi/linux/nsfs.h
+++ b/include/uapi/linux/nsfs.h
@@ -55,6 +55,7 @@ enum init_ns_ino {
        MNT_NS_INIT_INO         = 0xEFFFFFF8U,
 #ifdef __KERNEL__
        MNT_NS_ANON_INO         = 0xEFFFFFF7U,
+       MNT_NS_INO_SPECIAL_MAX  = MNT_NS_ANON_INO,
 #endif
 };

diff --git a/kernel/nscommon.c b/kernel/nscommon.c
index 3166c1fd844a..e7a3dd2189cc 100644
--- a/kernel/nscommon.c
+++ b/kernel/nscommon.c
@@ -91,7 +91,10 @@ int __ns_common_init(struct ns_common *ns, u32 ns_type, const struct proc_ns_ope

 void __ns_common_free(struct ns_common *ns)
 {
-       proc_free_inum(ns->inum);
+       security_namespace_free(ns);
+
+       if (ns->inum > MNT_NS_INO_SPECIAL_MAX)
+               proc_free_inum(ns->inum);
 }

 struct ns_common *__must_check ns_owner(struct ns_common *ns)
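
The intent of the hunk above can be modelled in user space roughly like
this.  It is a sketch under assumptions: struct ns_free_model and
ns_common_free_model() are hypothetical names, free() stands in for
releasing the LSM blob, and a flag stands in for proc_free_inum().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Mirrors MNT_NS_INO_SPECIAL_MAX (== MNT_NS_ANON_INO) from the hunk above. */
#define MODEL_INO_SPECIAL_MAX 0xEFFFFFF7U

/* Hypothetical stand-in for struct ns_common; not the kernel type. */
struct ns_free_model {
	uint32_t inum;      /* inode number of the namespace */
	void *security;     /* stand-in for ns->ns_security */
	bool inum_released; /* set where proc_free_inum() would be called */
};

void ns_common_free_model(struct ns_free_model *ns)
{
	/* The security blob is released for every namespace, anonymous or not. */
	free(ns->security);
	ns->security = NULL;

	/* Only inode numbers above the special range are handed back. */
	if (ns->inum > MODEL_INO_SPECIAL_MAX)
		ns->inum_released = true;
}
```

With this shape an anonymous mount namespace goes through the common
free path, so the blob is released there without an extra call in
free_mnt_ns().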

> 
> Daniel, could you please confirm that this fixes the memory leak?
> 
> [1] https://lore.kernel.org/all/20260330193100.3603-1-danieldurning.work@gmail.com/
> 
> 
> > > +/**
> > > + * security_namespace_free() - Release LSM security data from a namespace
> > > + * @ns: the namespace being freed
> > > + *
> > > + * Release security data attached to the namespace. Called before the
> > > + * namespace structure is freed.
> > > + *
> > > + * Note: The namespace may be freed via kfree_rcu(). LSMs must use
> > > + * RCU-safe freeing for any data that might be accessed by concurrent
> > > + * RCU readers.
> > > + */
> > > +void security_namespace_free(struct ns_common *ns)
> > > +{
> > > +       if (!ns->ns_security)
> > > +               return;
> > > +
> > > +       call_void_hook(namespace_free, ns);
> > > +
> 
> > > +       kfree(ns->ns_security);
> > > +       ns->ns_security = NULL;
> 
> I think it would be safer to replace these two lines with:
> kfree_rcu_mightsleep(ns->ns_security)
> 
> > > +}


* Re: [PATCH] security: remove BUG_ON in security_skb_classify_flow
From: Jiayuan Chen @ 2026-04-10  1:56 UTC
  To: Serge E. Hallyn
  Cc: linux-security-module, paul, jmorris, linux-kernel, Kaiyan Mei,
	Yinhao Hu, Dongliang Mu
In-Reply-To: <adhLQDIILT/sHpzL@mail.hallyn.com>


On 4/10/26 8:58 AM, Serge E. Hallyn wrote:
> On Wed, Apr 08, 2026 at 07:42:57PM +0800, Jiayuan Chen wrote:
>> A BPF program attached to the xfrm_decode_session hook can return a
>> non-zero value, which causes BUG_ON(rc) in security_skb_classify_flow()
>> to trigger a kernel panic.
> It would seem worth it to have pointed at the previous discussion at
>
> https://lore.kernel.org/all/CAEjxPJ5aA01in+Z1yLF1cwe-3uqL_E8SKGK4J294D5eRG5__5Q@mail.gmail.com/
>
> Based on that, I guess this is probably ok, but still,
>
>> Remove the BUG_ON and change the return type from void to int, so that
>> callers can optionally handle the error.
> but you don't have the existing callers handling the error.  It's
> conceivable they won't care, but it's also possible that they were
> counting on a BUG_ON in that case.
>
> What *should* callers (icmp_reply, etc) do if an error code is
> returned?  Should they ignore it?  In that case, would it be
> better to change security_skb_classify_flow() to return void?
>
Thanks for the pointer.

So I think Feng's patch is sufficient and can be applied?



* Re: [RFC PATCH v1 05/11] landlock: Enforce namespace entry restrictions
From: Tingmao Wang @ 2026-04-10  1:45 UTC
  To: Mickaël Salaün, Günther Noack
  Cc: Christian Brauner, Paul Moore, Serge E . Hallyn, Justin Suess,
	Lennart Poettering, Mikhail Ivanov, Nicolas Bouchinet,
	Shervin Oloumi, kernel-team, linux-fsdevel, linux-kernel,
	linux-security-module
In-Reply-To: <20260312100444.2609563-6-mic@digikod.net>

On 3/12/26 10:04, Mickaël Salaün wrote:
> [...]
> diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
> index f88fa1f68b77..b76e656241df 100644
> --- a/include/uapi/linux/landlock.h
> +++ b/include/uapi/linux/landlock.h
> @@ -51,6 +51,14 @@ struct landlock_ruleset_attr {
>  	 * resources (e.g. IPCs).
>  	 */
>  	__u64 scoped;
> +	/**
> +	 * @handled_perm: Bitmask of permissions (cf. `Permission flags`_)
> +	 * that this ruleset handles.  Each permission controls a broad
> +	 * operation enforced at a kernel chokepoint: all instances of
> +	 * that operation are denied unless explicitly allowed by a rule.
> +	 * See Documentation/security/landlock.rst for the rationale.
> +	 */
> +	__u64 handled_perm;
>  };
>  
>  /**
> @@ -153,6 +161,11 @@ enum landlock_rule_type {
>  	 * landlock_net_port_attr .
>  	 */
>  	LANDLOCK_RULE_NET_PORT,
> +	/**
> +	 * @LANDLOCK_RULE_NAMESPACE: Type of a &struct
> +	 * landlock_namespace_attr .
> +	 */
> +	LANDLOCK_RULE_NAMESPACE,
>  };
>  
>  /**
> @@ -206,6 +219,24 @@ struct landlock_net_port_attr {
>  	__u64 port;
>  };
>  
> +/**
> + * struct landlock_namespace_attr - Namespace type definition
> + *
> + * Argument of sys_landlock_add_rule() with %LANDLOCK_RULE_NAMESPACE.
> + */
> +struct landlock_namespace_attr {
> +	/**
> +	 * @allowed_perm: Must be set to %LANDLOCK_PERM_NAMESPACE_ENTER.
> +	 */
> +	__u64 allowed_perm;
> +	/**
> +	 * @namespace_types: Bitmask of namespace types (``CLONE_NEW*`` flags)
> +	 * that should be allowed to be entered under this rule.  Unknown bits
> +	 * are silently ignored for forward compatibility.
> +	 */
> +	__u64 namespace_types;
> +};
> +
>  /**
>   * DOC: fs_access
>   *

This UAPI looks good, follows existing patterns and is extensible.

btw, I guess for consistency, later on this new handled_perm should also
have a quiet_perm, which would allow suppressing audit logs for namespace
/ capability rules (for those (possibly a subset) added with
LANDLOCK_ADD_RULE_QUIET)?

> [...]
> @@ -153,6 +153,48 @@ landlock_get_applicable_subject(const struct cred *const cred,
>  	return NULL;
>  }
>  
> +/**
> + * landlock_perm_is_denied - Check if a permission bitmask request is denied
> + *
> + * @domain: The enforced domain.
> + * @perm_bit: The LANDLOCK_PERM_* flag to check.
> + * @request_value: Compact bitmask to look for (e.g. result of
> + *                 ``landlock_ns_type_to_bit(CLONE_NEWNET)``).
> + *
> + * Iterate from the youngest layer to the oldest.  For each layer that

How about this:

/**
 * landlock_perm_is_denied - Check if a permission request is denied
 *
 * @domain: The enforced domain.
 * @perm_bit: The LANDLOCK_PERM_* flag to check.
 * @request_value: Compact bitmask to look for (e.g. result of
 *                 ``landlock_ns_type_to_bit(CLONE_NEWNET)``).
 *                 Must have only one bit set.
 *
 * Iterate from the youngest layer to the oldest.  For each layer that

Basically, to make it more obvious that this function only checks one
bit.  Currently, if a combination of permission bits is passed, access is
allowed if any of them is allowed, which will probably be a bug if the
function is accidentally used this way in the future.  I was considering a
WARN_ON_ONCE, but it may be unnecessary for now given that the caller
always passes a landlock_*_to_bit result (and those helpers already
WARN_ON_ONCE if given an invalid parameter).
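
To illustrate the concern, here is a small user-space sketch.  The
helper names are hypothetical and this is not the Landlock
implementation (it also returns "allowed" rather than "denied"); it only
contrasts "any overlap" semantics with a check that insists on a
single-bit request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when exactly one bit is set in @value. */
bool has_single_bit(uint64_t value)
{
	return value != 0 && (value & (value - 1)) == 0;
}

/* "Any overlap" semantics: a multi-bit request passes if one bit matches. */
bool any_bit_allowed(uint64_t allowed, uint64_t request)
{
	return (allowed & request) != 0;
}

/* Stricter variant: a zero or multi-bit request is refused outright. */
bool single_bit_allowed(uint64_t allowed, uint64_t request)
{
	if (!has_single_bit(request))
		return false;

	return (allowed & request) != 0;
}
```

With only bit 0 allowed, a request for bits 0 and 1 together passes
any_bit_allowed() but is refused by single_bit_allowed(), which is the
accidental misuse the kernel-doc note is meant to prevent.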

Reviewed-by: Tingmao Wang <m@maowtm.org>


* Re: [RFC PATCH v1 04/11] landlock: Wrap per-layer access masks in struct layer_rights
From: Tingmao Wang @ 2026-04-10  1:45 UTC
  To: Mickaël Salaün
  Cc: Christian Brauner, Günther Noack, Paul Moore,
	Serge E . Hallyn, Justin Suess, Lennart Poettering,
	Mikhail Ivanov, Nicolas Bouchinet, Shervin Oloumi, kernel-team,
	linux-fsdevel, linux-kernel, linux-security-module
In-Reply-To: <20260312100444.2609563-5-mic@digikod.net>

On 3/12/26 10:04, Mickaël Salaün wrote:
> [...]

Hi Mickaël,

As requested I have reviewed this series.  All looks good to me with one
minor comment on the next patch.

(for patches 4, 5, 6 and 10)
Reviewed-by: Tingmao Wang <m@maowtm.org>


* Re: [PATCH] security: remove BUG_ON in security_skb_classify_flow
From: Serge E. Hallyn @ 2026-04-10  0:58 UTC
  To: Jiayuan Chen
  Cc: linux-security-module, paul, jmorris, serge, linux-kernel,
	Kaiyan Mei, Yinhao Hu, Dongliang Mu
In-Reply-To: <20260408114257.298500-1-jiayuan.chen@linux.dev>

On Wed, Apr 08, 2026 at 07:42:57PM +0800, Jiayuan Chen wrote:
> A BPF program attached to the xfrm_decode_session hook can return a
> non-zero value, which causes BUG_ON(rc) in security_skb_classify_flow()
> to trigger a kernel panic.

It would seem worth it to have pointed at the previous discussion at

https://lore.kernel.org/all/CAEjxPJ5aA01in+Z1yLF1cwe-3uqL_E8SKGK4J294D5eRG5__5Q@mail.gmail.com/

Based on that, I guess this is probably ok, but still,

> Remove the BUG_ON and change the return type from void to int, so that
> callers can optionally handle the error.

but you don't have the existing callers handling the error.  It's
conceivable they won't care, but it's also possible that they were
counting on a BUG_ON in that case.

What *should* callers (icmp_reply, etc) do if an error code is
returned?  Should they ignore it?  In that case, would it be
better to change security_skb_classify_flow() to return void?

> Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
> Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
> Reported-by: Dongliang Mu <dzm91@hust.edu.cn>
> Closes: https://lore.kernel.org/bpf/4c4d04ba.6c12b.19c039b69e6.Coremail.kaiyanm@hust.edu.cn/
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
>  include/linux/security.h |  7 ++++---
>  security/security.c      | 16 +++++++++++-----
>  2 files changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/security.h b/include/linux/security.h
> index ee88dd2d2d1f..6d210dc4c649 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -1975,7 +1975,7 @@ int security_xfrm_state_pol_flow_match(struct xfrm_state *x,
>  				       struct xfrm_policy *xp,
>  				       const struct flowi_common *flic);
>  int security_xfrm_decode_session(struct sk_buff *skb, u32 *secid);
> -void security_skb_classify_flow(struct sk_buff *skb, struct flowi_common *flic);
> +int security_skb_classify_flow(struct sk_buff *skb, struct flowi_common *flic);
>  
>  #else	/* CONFIG_SECURITY_NETWORK_XFRM */
>  
> @@ -2038,9 +2038,10 @@ static inline int security_xfrm_decode_session(struct sk_buff *skb, u32 *secid)
>  	return 0;
>  }
>  
> -static inline void security_skb_classify_flow(struct sk_buff *skb,
> -					      struct flowi_common *flic)
> +static inline int security_skb_classify_flow(struct sk_buff *skb,
> +					     struct flowi_common *flic)
>  {
> +	return 0;
>  }
>  
>  #endif	/* CONFIG_SECURITY_NETWORK_XFRM */
> diff --git a/security/security.c b/security/security.c
> index a26c1474e2e4..26a34eb363c2 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -4990,12 +4990,18 @@ int security_xfrm_decode_session(struct sk_buff *skb, u32 *secid)
>  	return call_int_hook(xfrm_decode_session, skb, secid, 1);
>  }
>  
> -void security_skb_classify_flow(struct sk_buff *skb, struct flowi_common *flic)
> +/**
> + * security_skb_classify_flow() - Set the flow's secid from the security label
> + * @skb: packet
> + * @flic: flow common structure to set
> + *
> + * Decode the packet in @skb and set the flow's secid in @flic.
> + *
> + * Return: Return 0 if successful.
> + */
> +int security_skb_classify_flow(struct sk_buff *skb, struct flowi_common *flic)
>  {
> -	int rc = call_int_hook(xfrm_decode_session, skb, &flic->flowic_secid,
> -			       0);
> -
> -	BUG_ON(rc);
> +	return call_int_hook(xfrm_decode_session, skb, &flic->flowic_secid, 0);
>  }
>  EXPORT_SYMBOL(security_skb_classify_flow);
>  #endif	/* CONFIG_SECURITY_NETWORK_XFRM */
> -- 
> 2.43.0


* Re: [PATCH v2 0/4] Firmware LSM hook
From: Paul Moore @ 2026-04-09 21:04 UTC
  To: Leon Romanovsky
  Cc: Roberto Sassu, KP Singh, Matt Bobrowski, Alexei Starovoitov,
	Daniel Borkmann, John Fastabend, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
	Jason Gunthorpe, Saeed Mahameed, Itay Avraham, Dave Jiang,
	Jonathan Cameron, bpf, linux-kernel, linux-kselftest, linux-rdma,
	Chiara Meiohas, Maher Sanalla, linux-security-module
In-Reply-To: <20260409124553.GB720371@unreal>

On Thu, Apr 9, 2026 at 8:45 AM Leon Romanovsky <leon@kernel.org> wrote:
> On Thu, Apr 09, 2026 at 02:27:43PM +0200, Roberto Sassu wrote:
> > On Thu, 2026-04-09 at 15:12 +0300, Leon Romanovsky wrote:
> > > On Tue, Mar 31, 2026 at 08:56:32AM +0300, Leon Romanovsky wrote:
> > > > From Chiara:
> > > >
> > > > This patch set introduces a new BPF LSM hook to validate firmware commands
> > > > triggered by userspace before they are submitted to the device. The hook
> > > > runs after the command buffer is constructed, right before it is sent
> > > > to firmware.
> > >
> > > <...>
> > >
> > > > ---
> > > > Chiara Meiohas (4):
> > > >       bpf: add firmware command validation hook
> > > >       selftests/bpf: add test cases for fw_validate_cmd hook
> > > >       RDMA/mlx5: Externally validate FW commands supplied in DEVX interface
> > > >       fwctl/mlx5: Externally validate FW commands supplied in fwctl
> > >
> > > Hi,
> > >
> > > Can we get Ack from BPF/LSM side?
> >
> > + Paul, linux-security-module ML
> >
> > Hi
> >
> > probably you also want to get an Ack from the LSM maintainer (added in
> > CC with the list). Most likely, he will also ask you to create the
> > security_*() functions counterparts of the BPF hooks.
>
> We implemented this approach in v1:
> https://patch.msgid.link/20260309-fw-lsm-hook-v1-0-4a6422e63725@nvidia.com
> and were advised to pursue a different direction.

I'm assuming you are referring to my comments?  If so, that isn't
exactly what I said, I mentioned at least one other option besides
going directly to BPF.  Ultimately, it is your choice to decide how
you want to proceed, but to claim I advised you to avoid a LSM based
solution isn't strictly correct.

Regardless, looking at your v2 patchset, it looks like you've taken an
unusual approach of using some of the LSM mechanisms, e.g. LSM_HOOK(),
but not actually exposing a LSM hook with proper callbacks.
Unfortunately, that's not something we want to support.  If you want
to pursue an LSM based solution, complete with a security_XXX() hook,
use of LSM_HOOK() macros, etc. then that's fine, I'm happy to work
with you on that.  However, if you've decided that your preferred
option is to create a BPF hook you should avoid using things like
LSM_HOOK() and locating your hook/code in bpf_lsm.c.

The good news is that there are plenty of other examples of BPF
pluggable code that you could use as an example, one such thing is the
update_socket_protocol() BPF hook that was originally proposed as a
LSM hook, but moved to a dedicated BPF hook as we generally want to
avoid changing non-LSM kernel objects within the scope of the LSMs.
While your proposed case is slightly different, I think the basic idea
and mechanism should still be useful.

https://lore.kernel.org/all/cover.1692147782.git.geliang.tang@suse.com

-- 
paul-moore.com


* Re: [PATCH 00/61] treewide: Use IS_ERR_OR_NULL over manual NULL check - refactor
From: Al Viro @ 2026-04-09 18:16 UTC
  To: Philipp Hahn
  Cc: amd-gfx, apparmor, bpf, ceph-devel, cocci, dm-devel, dri-devel,
	gfs2, intel-gfx, intel-wired-lan, iommu, kvm, linux-arm-kernel,
	linux-block, linux-bluetooth, linux-btrfs, linux-cifs, linux-clk,
	linux-erofs, linux-ext4, linux-fsdevel, linux-gpio, linux-hyperv,
	linux-input, linux-kernel, linux-leds, linux-media, linux-mips,
	linux-mm, linux-modules, linux-mtd, linux-nfs, linux-omap,
	linux-phy, linux-pm, linux-rockchip, linux-s390, linux-scsi,
	linux-sctp, linux-security-module, linux-sh, linux-sound,
	linux-stm32, linux-trace-kernel, linux-usb, linux-wireless,
	netdev, ntfs3, samba-technical, sched-ext, target-devel,
	tipc-discussion, v9fs, Julia Lawall, Nicolas Palix, Chris Mason,
	David Sterba, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Theodore Ts'o, Andreas Dilger, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Bharath SM,
	Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet,
	Christian Schoenebeck, Gao Xiang, Chao Yu, Yue Hu, Jeffle Xu,
	Sandeep Dhavale, Hongbo Li, Chunhai Guo, Miklos Szeredi,
	Konstantin Komarov, Andreas Gruenbacher, Kees Cook, Tony Luck,
	Guilherme G. Piccoli, Jan Kara, Phillip Lougher,
	Christian Brauner, Jan Kara, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Tejun Heo, David Vernet, Andrea Righi,
	Changwoo Min, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Ben Segall, Mel Gorman,
	Valentin Schneider, Luis Chamberlain, Petr Pavlu, Daniel Gomez,
	Sami Tolvanen, Aaron Tomlin, Sylwester Nawrocki, Liam Girdwood,
	Mark Brown, Jaroslav Kysela, Takashi Iwai, Max Filippov,
	Paolo Bonzini, John Johansen, Paul Moore, James Morris,
	Serge E. Hallyn, Andrew Morton, Alasdair Kergon, Mike Snitzer,
	Mikulas Patocka, Benjamin Marzinski, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Marcel Holtmann, Johan Hedberg, Luiz Augusto von Dentz,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Jamal Hadi Salim, Jiri Pirko,
	Marcelo Ricardo Leitner, Xin Long, Trond Myklebust,
	Anna Schumaker, Chuck Lever, Jeff Layton, NeilBrown,
	Olga Kornievskaia, Dai Ngo, Jon Maloy, Johannes Berg,
	Catalin Marinas, Russell King, John Crispin, Thomas Bogendoerfer,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	Andrzej Hajda, Neil Armstrong, Robert Foss, Laurent Pinchart,
	Jonas Karlman, Jernej Skrabec, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Zhenyu Wang,
	Zhi Wang, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	Tvrtko Ursulin, Alex Deucher, Christian König, Sandy Huang,
	Heiko Stübner, Andy Yan, Igor Russkikh, Andrew Lunn,
	Pavan Chebbi, Michael Chan, Potnuri Bharat Teja, Tony Nguyen,
	Przemek Kitszel, Taras Chornyi, Maxime Coquelin, Alexandre Torgue,
	Iyappan Subramanian, Keyur Chudgar, Quan Nguyen, Heiner Kallweit,
	Marc Zyngier, Thomas Gleixner, Andrew Lunn, Gregory Clement,
	Sebastian Hesselbarth, Vinod Koul, Linus Walleij, Ulf Hansson,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Martin K. Petersen,
	Eduardo Valentin, Keerthy, Rafael J. Wysocki, Daniel Lezcano,
	Zhang Rui, Lukasz Luba, Alex Williamson, Mark Greer,
	Miquel Raynal, Richard Weinberger, Vignesh Raghavendra,
	Shuah Khan, Kieran Bingham, Mauro Carvalho Chehab, Joerg Roedel,
	Will Deacon, Robin Murphy, Lee Jones, Pavel Machek, Dave Penkler,
	K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	Justin Sanders, Jens Axboe, Georgi Djakov, Michael Turquette,
	Stephen Boyd, Philipp Zabel, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Pali Rohár, Dmitry Torokhov
In-Reply-To: <20260310-b4-is_err_or_null-v1-0-bd63b656022d@avm.de>

On Tue, Mar 10, 2026 at 12:48:26PM +0100, Philipp Hahn wrote:
> While doing some static code analysis I stumbled over a common pattern,
> where IS_ERR() is combined with a NULL check. For that there is
> IS_ERR_OR_NULL().

... and valid uses of IS_ERR_OR_NULL are rare as hen teeth.
Most of those are "I'm not sure how this function returns an
error, let's use that just in case".

Please, do not introduce more of that crap.
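
One way to read this objection, sketched in user space: a function
should have a single failure convention, and the caller should test for
exactly that.  The model_* helpers below are simplified stand-ins for the
kernel's ERR_PTR()/PTR_ERR()/IS_ERR(), and both lookup functions are
hypothetical examples.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Simplified user-space models of ERR_PTR(), PTR_ERR() and IS_ERR(). */
static inline void *model_err_ptr(long error) { return (void *)error; }
static inline long model_ptr_err(const void *ptr) { return (long)ptr; }
static inline bool model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

static int model_table[4] = { 10, 20, 30, 40 };

/* Convention A: never returns NULL; failure is an error pointer. */
int *lookup_or_err(size_t idx)
{
	if (idx >= 4)
		return model_err_ptr(-ENOENT);

	return &model_table[idx];
}

/* Convention B: never returns an error pointer; failure is NULL. */
int *lookup_or_null(size_t idx)
{
	return idx < 4 ? &model_table[idx] : NULL;
}
```

A caller of lookup_or_err() only needs the IS_ERR-style test and a
caller of lookup_or_null() only needs a NULL test; a combined check is
needed only when the callee really can return both.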


* [GIT PULL] Landlock update for v7.1-rc1
From: Mickaël Salaün @ 2026-04-09 17:31 UTC
  To: Linus Torvalds
  Cc: Mickaël Salaün, Georgia Garcia, Günther Noack,
	Günther Noack, Jann Horn, Justin Suess, Paul Moore,
	Sebastian Andrzej Siewior, linux-kernel, linux-security-module

Hi,

This PR adds a new Landlock access right for pathname UNIX domain
sockets thanks to a new LSM hook, and a few fixes.

Please pull these changes for v7.1-rc1 .  These commits merge cleanly
with your master branch.  Kernel changes have been tested in the latest
linux-next releases for some weeks, and since this week for the
LOG_SUBDOMAINS_OFF fixes.

Test coverage for security/landlock is 91.1% of 2152 lines according to
LLVM 21, and it was 91.0% of 2105 lines before this PR.

Regards,
 Mickaël

--
The following changes since commit 7aaa8047eafd0bd628065b15757d9b48c5f9c07d:

  Linux 7.0-rc6 (2026-03-29 15:40:00 -0700)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git tags/landlock-7.1-rc1

for you to fetch changes up to 3457a5ccacd34fdd5ebd3a4745e721b5a1239690:

  landlock: Document fallocate(2) as another truncation corner case (2026-04-07 18:51:11 +0200)

----------------------------------------------------------------
Landlock update for v7.1-rc1

----------------------------------------------------------------
Günther Noack (11):
      landlock: Use mem_is_zero() in is_layer_masks_allowed()
      landlock: Control pathname UNIX domain socket resolution by path
      landlock: Clarify BUILD_BUG_ON check in scoping logic
      samples/landlock: Add support for named UNIX domain socket restrictions
      selftests/landlock: Replace access_fs_16 with ACCESS_ALL in fs_test
      selftests/landlock: Test LANDLOCK_ACCESS_FS_RESOLVE_UNIX
      selftests/landlock: Audit test for LANDLOCK_ACCESS_FS_RESOLVE_UNIX
      selftests/landlock: Check that coredump sockets stay unrestricted
      selftests/landlock: Simplify ruleset creation and enforcement in fs_test
      landlock: Document FS access right for pathname UNIX sockets
      landlock: Document fallocate(2) as another truncation corner case

Justin Suess (1):
      lsm: Add LSM hook security_unix_find

Mickaël Salaün (11):
      landlock: Fix LOG_SUBDOMAINS_OFF inheritance across fork()
      landlock: Allow TSYNC with LOG_SUBDOMAINS_OFF and fd=-1
      selftests/landlock: Fix snprintf truncation checks in audit helpers
      selftests/landlock: Fix socket file descriptor leaks in audit helpers
      selftests/landlock: Drain stale audit records on init
      selftests/landlock: Skip stale records in audit_match_record()
      selftests/landlock: Fix format warning for __u64 in net_test
      landlock: Add missing kernel-doc "Return:" sections
      landlock: Improve kernel-doc "Return:" section consistency
      landlock: Fix formatting in tsync.c
      landlock: Fix kernel-doc warning for pointer-to-array parameters

 Documentation/security/landlock.rst                |   42 +-
 Documentation/userspace-api/landlock.rst           |   22 +-
 include/linux/lsm_hook_defs.h                      |    5 +
 include/linux/security.h                           |   11 +
 include/uapi/linux/landlock.h                      |   25 +-
 net/unix/af_unix.c                                 |   10 +-
 samples/landlock/sandboxer.c                       |   12 +-
 security/landlock/access.h                         |    4 +-
 security/landlock/audit.c                          |    1 +
 security/landlock/cred.c                           |    6 +-
 security/landlock/cred.h                           |    2 +-
 security/landlock/domain.c                         |    6 +-
 security/landlock/fs.c                             |  163 ++-
 security/landlock/id.c                             |    2 +-
 security/landlock/limits.h                         |    2 +-
 security/landlock/ruleset.c                        |   14 +-
 security/landlock/ruleset.h                        |    2 +-
 security/landlock/syscalls.c                       |   33 +-
 security/landlock/task.c                           |   22 +-
 security/landlock/tsync.c                          |  124 +-
 security/security.c                                |   20 +
 tools/testing/selftests/landlock/audit.h           |  133 +-
 tools/testing/selftests/landlock/audit_test.c      |  357 +++++-
 tools/testing/selftests/landlock/base_test.c       |    2 +-
 tools/testing/selftests/landlock/fs_test.c         | 1343 +++++++++++---------
 tools/testing/selftests/landlock/net_test.c        |    2 +-
 tools/testing/selftests/landlock/ptrace_test.c     |    1 -
 .../selftests/landlock/scoped_abstract_unix_test.c |    1 -
 tools/testing/selftests/landlock/tsync_test.c      |   77 ++
 29 files changed, 1650 insertions(+), 794 deletions(-)


* Re: [RFC PATCH v1 01/11] security: add LSM blob and hooks for namespaces
From: Mickaël Salaün @ 2026-04-09 16:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Günther Noack, Paul Moore, Serge E . Hallyn, Justin Suess,
	Lennart Poettering, Mikhail Ivanov, Nicolas Bouchinet,
	Shervin Oloumi, Tingmao Wang, kernel-team, linux-fsdevel,
	linux-kernel, linux-security-module, Daniel Durning
In-Reply-To: <20260325-filmverleih-auffressen-e897fcf8d3f2@brauner>

On Wed, Mar 25, 2026 at 01:31:30PM +0100, Christian Brauner wrote:
> On Thu, Mar 12, 2026 at 11:04:34AM +0100, Mickaël Salaün wrote:
> > From: Christian Brauner <brauner@kernel.org>
> > 
> > All namespace types now share the same ns_common infrastructure. Extend
> > this to include a security blob so LSMs can start managing namespaces
> > uniformly without having to add one-off hooks or security fields to
> > every individual namespace type.
> > 
> > Add a ns_security pointer to ns_common and the corresponding lbs_ns
> > blob size to lsm_blob_sizes. Allocation and freeing hooks are called
> > from the common __ns_common_init() and __ns_common_free() paths so
> > every namespace type gets covered in one go. All information about the
> > namespace type and the appropriate casting helpers to get at the
> > containing namespace are available via ns_common making it
> > straightforward for LSMs to differentiate when they need to.
> > 
> > A namespace_install hook is called from validate_ns() during setns(2)
> > giving LSMs a chance to enforce policy on namespace transitions.
> > 
> > Individual namespace types can still have their own specialized security
> > hooks when needed. This is just the common baseline that makes it easy
> > to track and manage namespaces from the security side without requiring
> > every namespace type to reinvent the wheel.
> > 
> > Cc: Günther Noack <gnoack@google.com>
> > Cc: Paul Moore <paul@paul-moore.com>
> > Cc: Serge E. Hallyn <serge@hallyn.com>
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > Link: https://lore.kernel.org/r/20260216-work-security-namespace-v1-1-075c28758e1f@kernel.org
> > ---
> >  include/linux/lsm_hook_defs.h      |  3 ++
> >  include/linux/lsm_hooks.h          |  1 +
> >  include/linux/ns/ns_common_types.h |  3 ++
> >  include/linux/security.h           | 20 ++++++++
> >  kernel/nscommon.c                  | 12 +++++
> >  kernel/nsproxy.c                   |  8 +++-
> >  security/lsm_init.c                |  2 +
> >  security/security.c                | 76 ++++++++++++++++++++++++++++++
> >  8 files changed, 124 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> > index 8c42b4bde09c..fefd3aa6d8f4 100644
> > --- a/include/linux/lsm_hook_defs.h
> > +++ b/include/linux/lsm_hook_defs.h
> > @@ -260,6 +260,9 @@ LSM_HOOK(int, -ENOSYS, task_prctl, int option, unsigned long arg2,
> >  LSM_HOOK(void, LSM_RET_VOID, task_to_inode, struct task_struct *p,
> >  	 struct inode *inode)
> >  LSM_HOOK(int, 0, userns_create, const struct cred *cred)
> > +LSM_HOOK(int, 0, namespace_alloc, struct ns_common *ns)
> > +LSM_HOOK(void, LSM_RET_VOID, namespace_free, struct ns_common *ns)
> > +LSM_HOOK(int, 0, namespace_install, const struct nsset *nsset, struct ns_common *ns)
> >  LSM_HOOK(int, 0, ipc_permission, struct kern_ipc_perm *ipcp, short flag)
> >  LSM_HOOK(void, LSM_RET_VOID, ipc_getlsmprop, struct kern_ipc_perm *ipcp,
> >  	 struct lsm_prop *prop)
> > diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> > index d48bf0ad26f4..3e7afe76e86c 100644
> > --- a/include/linux/lsm_hooks.h
> > +++ b/include/linux/lsm_hooks.h
> > @@ -111,6 +111,7 @@ struct lsm_blob_sizes {
> >  	unsigned int lbs_ipc;
> >  	unsigned int lbs_key;
> >  	unsigned int lbs_msg_msg;
> > +	unsigned int lbs_ns;
> >  	unsigned int lbs_perf_event;
> >  	unsigned int lbs_task;
> >  	unsigned int lbs_xattr_count; /* num xattr slots in new_xattrs array */
> > diff --git a/include/linux/ns/ns_common_types.h b/include/linux/ns/ns_common_types.h
> > index 0014fbc1c626..170288e2e895 100644
> > --- a/include/linux/ns/ns_common_types.h
> > +++ b/include/linux/ns/ns_common_types.h
> > @@ -115,6 +115,9 @@ struct ns_common {
> >  	struct dentry *stashed;
> >  	const struct proc_ns_operations *ops;
> >  	unsigned int inum;
> > +#ifdef CONFIG_SECURITY
> > +	void *ns_security;
> > +#endif
> >  	union {
> >  		struct ns_tree;
> >  		struct rcu_head ns_rcu;
> > diff --git a/include/linux/security.h b/include/linux/security.h
> > index 83a646d72f6f..611b9098367d 100644
> > --- a/include/linux/security.h
> > +++ b/include/linux/security.h
> > @@ -67,6 +67,7 @@ enum fs_value_type;
> >  struct watch;
> >  struct watch_notification;
> >  struct lsm_ctx;
> > +struct nsset;
> >  
> >  /* Default (no) options for the capable function */
> >  #define CAP_OPT_NONE 0x0
> > @@ -80,6 +81,7 @@ struct lsm_ctx;
> >  
> >  struct ctl_table;
> >  struct audit_krule;
> > +struct ns_common;
> >  struct user_namespace;
> >  struct timezone;
> >  
> > @@ -533,6 +535,9 @@ int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
> >  			unsigned long arg4, unsigned long arg5);
> >  void security_task_to_inode(struct task_struct *p, struct inode *inode);
> >  int security_create_user_ns(const struct cred *cred);
> > +int security_namespace_alloc(struct ns_common *ns);
> > +void security_namespace_free(struct ns_common *ns);
> > +int security_namespace_install(const struct nsset *nsset, struct ns_common *ns);
> >  int security_ipc_permission(struct kern_ipc_perm *ipcp, short flag);
> >  void security_ipc_getlsmprop(struct kern_ipc_perm *ipcp, struct lsm_prop *prop);
> >  int security_msg_msg_alloc(struct msg_msg *msg);
> > @@ -1407,6 +1412,21 @@ static inline int security_create_user_ns(const struct cred *cred)
> >  	return 0;
> >  }
> >  
> > +static inline int security_namespace_alloc(struct ns_common *ns)
> > +{
> > +	return 0;
> > +}
> > +
> > +static inline void security_namespace_free(struct ns_common *ns)
> > +{
> > +}
> > +
> > +static inline int security_namespace_install(const struct nsset *nsset,
> > +					     struct ns_common *ns)
> > +{
> > +	return 0;
> > +}
> > +
> >  static inline int security_ipc_permission(struct kern_ipc_perm *ipcp,
> >  					  short flag)
> >  {
> > diff --git a/kernel/nscommon.c b/kernel/nscommon.c
> > index bdc3c86231d3..de774e374f9d 100644
> > --- a/kernel/nscommon.c
> > +++ b/kernel/nscommon.c
> > @@ -4,6 +4,7 @@
> >  #include <linux/ns_common.h>
> >  #include <linux/nstree.h>
> >  #include <linux/proc_ns.h>
> > +#include <linux/security.h>
> >  #include <linux/user_namespace.h>
> >  #include <linux/vfsdebug.h>
> >  
> > @@ -59,6 +60,9 @@ int __ns_common_init(struct ns_common *ns, u32 ns_type, const struct proc_ns_ope
> >  
> >  	refcount_set(&ns->__ns_ref, 1);
> >  	ns->stashed = NULL;
> > +#ifdef CONFIG_SECURITY
> > +	ns->ns_security = NULL;
> > +#endif
> >  	ns->ops = ops;
> >  	ns->ns_id = 0;
> >  	ns->ns_type = ns_type;
> > @@ -77,6 +81,13 @@ int __ns_common_init(struct ns_common *ns, u32 ns_type, const struct proc_ns_ope
> >  		ret = proc_alloc_inum(&ns->inum);
> >  	if (ret)
> >  		return ret;
> > +
> > +	ret = security_namespace_alloc(ns);
> > +	if (ret) {
> > +		proc_free_inum(ns->inum);
> 
> ret = security_namespace_alloc(ns);
> if (ret && !inum)
>         proc_free_inum(ns->inum);
> return ret;
> 
> 
> > +		return ret;
> > +	}
> > +
> >  	/*
> >  	 * Tree ref starts at 0. It's incremented when namespace enters
> >  	 * active use (installed in nsproxy) and decremented when all
> > @@ -91,6 +102,7 @@ int __ns_common_init(struct ns_common *ns, u32 ns_type, const struct proc_ns_ope
> >  
> >  void __ns_common_free(struct ns_common *ns)
> >  {
> > +	security_namespace_free(ns);
> >  	proc_free_inum(ns->inum);
> >  }
> >  
> > diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> > index 259c4b4f1eeb..f0b30d1907e7 100644
> > --- a/kernel/nsproxy.c
> > +++ b/kernel/nsproxy.c
> > @@ -379,7 +379,13 @@ static int prepare_nsset(unsigned flags, struct nsset *nsset)
> >  
> >  static inline int validate_ns(struct nsset *nsset, struct ns_common *ns)
> >  {
> > -	return ns->ops->install(nsset, ns);
> > +	int ret;
> > +
> > +	ret = ns->ops->install(nsset, ns);
> > +	if (ret)
> > +		return ret;
> > +
> > +	return security_namespace_install(nsset, ns);
> 
> In my local tree I had that moved before the ->install() and I think
> that's the correct thing to do. So please switch to that.

Looks good, I'll include your fixes in the next version.
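
For reference, the agreed ordering in validate_ns() can be modelled in
userspace. This is only a sketch under stated assumptions: struct ns_model,
fake_lsm_install() and fake_ops_install() are hypothetical stand-ins for
struct ns_common, security_namespace_install() and ns->ops->install(), not
kernel code. It shows that with the hook called first, a denial is returned
before the ->install() callback runs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct ns_common. */
struct ns_model {
	int (*install)(struct ns_model *ns);	/* models ns->ops->install() */
	int lsm_denies;				/* non-zero: the model LSM rejects */
	int installed;				/* set by the install callback */
};

/* Models security_namespace_install(): 0 on success, -EPERM on denial. */
static int fake_lsm_install(const struct ns_model *ns)
{
	return ns->lsm_denies ? -EPERM : 0;
}

/* Models a namespace type's ->install() callback. */
static int fake_ops_install(struct ns_model *ns)
{
	ns->installed = 1;
	return 0;
}

/* Models validate_ns() with the LSM hook moved before ->install(). */
static int validate_ns_model(struct ns_model *ns)
{
	int ret;

	assert(ns != NULL && ns->install != NULL);

	ret = fake_lsm_install(ns);
	if (ret)
		return ret;

	return ns->install(ns);
}
```

In this model, a namespace with lsm_denies set yields -EPERM and its
installed flag stays 0.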

> 
> The rest looks good to me, thanks.

Another issue raised by Daniel Durning [1] is the freeing of anonymous
namespaces.

I'll extend this patch with this new hunk if that's ok:

diff --git a/fs/namespace.c b/fs/namespace.c
index 854f4fc66469..f6977e59be7d 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4186,6 +4186,8 @@ static void free_mnt_ns(struct mnt_namespace *ns)
 {
        if (!is_anon_ns(ns))
                ns_common_free(ns);
+       else
+               security_namespace_free(&ns->ns);
        dec_mnt_namespaces(ns->ucounts);
        mnt_ns_tree_remove(ns);
 }

Daniel, could you please confirm that this fixes the memory leak?

[1] https://lore.kernel.org/all/20260330193100.3603-1-danieldurning.work@gmail.com/
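
To make the intent of the free_mnt_ns() hunk concrete, here is a small
userspace model. It is a sketch, not the kernel implementation: struct
mnt_ns_model, blob_alloc(), blob_free(), common_free() and the live_blobs
counter are hypothetical names standing in for the mount namespace,
security_namespace_alloc(), security_namespace_free() and ns_common_free().
The assumption is that every namespace gets a blob at init time, while only
non-anonymous ones go through the common free path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

static int live_blobs;	/* blobs allocated and not yet freed */

/* Hypothetical stand-in for a mount namespace carrying an LSM blob. */
struct mnt_ns_model {
	bool anon;
	void *ns_security;
};

/* Models security_namespace_alloc(). */
static int blob_alloc(struct mnt_ns_model *ns)
{
	ns->ns_security = calloc(1, 16);
	if (!ns->ns_security)
		return -ENOMEM;
	live_blobs++;
	return 0;
}

/* Models security_namespace_free(): a no-op when there is no blob. */
static void blob_free(struct mnt_ns_model *ns)
{
	if (!ns->ns_security)
		return;
	free(ns->ns_security);
	ns->ns_security = NULL;
	live_blobs--;
}

/* Models ns_common_free(): releases the blob (and, in the kernel, the inum). */
static void common_free(struct mnt_ns_model *ns)
{
	blob_free(ns);
}

/* Models free_mnt_ns() with the proposed else branch. */
static void free_mnt_ns_model(struct mnt_ns_model *ns)
{
	assert(ns != NULL);
	if (!ns->anon)
		common_free(ns);
	else
		blob_free(ns);	/* without this branch the anon blob would leak */
}
```

After allocating one anonymous and one regular namespace and freeing both,
live_blobs is back to 0 in this model.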


> > +/**
> > + * security_namespace_free() - Release LSM security data from a namespace
> > + * @ns: the namespace being freed
> > + *
> > + * Release security data attached to the namespace. Called before the
> > + * namespace structure is freed.
> > + *
> > + * Note: The namespace may be freed via kfree_rcu(). LSMs must use
> > + * RCU-safe freeing for any data that might be accessed by concurrent
> > + * RCU readers.
> > + */
> > +void security_namespace_free(struct ns_common *ns)
> > +{
> > +       if (!ns->ns_security)
> > +               return;
> > +
> > +       call_void_hook(namespace_free, ns);
> > +

> > +       kfree(ns->ns_security);
> > +       ns->ns_security = NULL;

I think it would be safer to replace these two lines with:
kfree_rcu_mightsleep(ns->ns_security)

> > +}


* [PATCH v3] KEYS: trusted: Debugging as a feature
From: Jarkko Sakkinen @ 2026-04-09 16:07 UTC (permalink / raw)
  To: linux-integrity, keyrings
  Cc: Jarkko Sakkinen, Srish Srinivasan, Nayna Jain, James Bottomley,
	Mimi Zohar, David Howells, Paul Moore, James Morris,
	Serge E. Hallyn, Ahmad Fatoum, Pengutronix Kernel Team,
	linux-kernel, linux-security-module

From: Jarkko Sakkinen <jarkko@kernel.org>

TPM_DEBUG, and other similar flags, are a non-standard way to specify a
feature in the Linux kernel. Introduce CONFIG_TRUSTED_KEYS_DEBUG for trusted
keys, and use it to replace these ad-hoc feature flags.

Given that trusted keys debug dumps can contain sensitive data, harden the
feature as follows:

1. In the Kconfig description, stipulate that pr_debug() statements must be
   used.
2. Use pr_debug() statements in the TPM 1.x driver to print the protocol dump.
3. Require trusted.debug=1 on the kernel command line (default: 0) to
   activate dumps at runtime, even when CONFIG_TRUSTED_KEYS_DEBUG=y.

Traces, when actually needed, can be easily enabled by providing
trusted.dyndbg='+p' and trusted.debug=1 on the kernel command line.

Cc: Srish Srinivasan <ssrish@linux.ibm.com>
Reported-by: Nayna Jain <nayna@linux.ibm.com>
Closes: https://lore.kernel.org/all/7f8b8478-5cd8-4d97-bfd0-341fd5cf10f9@linux.ibm.com/
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
---
v3:
- Add a kernel command-line option for enabling the traces.
- Add safety information to the Kconfig entry.
v2:
- Implement for all trusted keys backends.
- Add HAVE_TRUSTED_KEYS_DEBUG as it is a good practice despite full
  coverage.
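
Illustration only (not part of the patch): the layered gating can be modelled
in userspace as below. MODEL_TRUSTED_KEYS_DEBUG, model_trusted_debug and
model_dump_payload() are hypothetical stand-ins for CONFIG_TRUSTED_KEYS_DEBUG,
the trusted.debug parameter and dump_payload(). The dynamic debug layer
(trusted.dyndbg) is not modelled, so this sketch has two gates where the patch
has three.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for CONFIG_TRUSTED_KEYS_DEBUG=y; remove to compile the dumps out. */
#define MODEL_TRUSTED_KEYS_DEBUG 1

/* Stand-in for the trusted.debug= parameter (default: off). */
static bool model_trusted_debug;

/* Returns the number of lines emitted so callers can observe the gating. */
static int model_dump_payload(FILE *out, const unsigned char *key, size_t len)
{
#ifdef MODEL_TRUSTED_KEYS_DEBUG
	size_t i;

	assert(out != NULL && key != NULL);
	if (!model_trusted_debug)
		return 0;	/* built in, but not enabled at runtime */

	fprintf(out, "key_len %zu\n", len);
	fputs("key ", out);
	for (i = 0; i < len; i++)
		fprintf(out, "%02x ", key[i]);
	fputc('\n', out);
	return 2;
#else
	(void)out;
	(void)key;
	(void)len;
	return 0;
#endif
}
```

With the runtime flag left at its default the function emits nothing, even
though the dump code is compiled in.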
---
 include/keys/trusted-type.h               | 21 ++++++-----
 security/keys/trusted-keys/Kconfig        | 23 ++++++++++++
 security/keys/trusted-keys/trusted_caam.c |  7 ++--
 security/keys/trusted-keys/trusted_core.c |  6 ++++
 security/keys/trusted-keys/trusted_tpm1.c | 44 +++++++++++++----------
 5 files changed, 71 insertions(+), 30 deletions(-)

diff --git a/include/keys/trusted-type.h b/include/keys/trusted-type.h
index 03527162613f..9f9940482da4 100644
--- a/include/keys/trusted-type.h
+++ b/include/keys/trusted-type.h
@@ -83,18 +83,21 @@ struct trusted_key_source {
 
 extern struct key_type key_type_trusted;
 
-#define TRUSTED_DEBUG 0
+#ifdef CONFIG_TRUSTED_KEYS_DEBUG
+extern bool trusted_debug;
 
-#if TRUSTED_DEBUG
 static inline void dump_payload(struct trusted_key_payload *p)
 {
-	pr_info("key_len %d\n", p->key_len);
-	print_hex_dump(KERN_INFO, "key ", DUMP_PREFIX_NONE,
-		       16, 1, p->key, p->key_len, 0);
-	pr_info("bloblen %d\n", p->blob_len);
-	print_hex_dump(KERN_INFO, "blob ", DUMP_PREFIX_NONE,
-		       16, 1, p->blob, p->blob_len, 0);
-	pr_info("migratable %d\n", p->migratable);
+	if (!trusted_debug)
+		return;
+
+	pr_debug("key_len %d\n", p->key_len);
+	print_hex_dump_debug("key ", DUMP_PREFIX_NONE,
+			     16, 1, p->key, p->key_len, 0);
+	pr_debug("bloblen %d\n", p->blob_len);
+	print_hex_dump_debug("blob ", DUMP_PREFIX_NONE,
+			     16, 1, p->blob, p->blob_len, 0);
+	pr_debug("migratable %d\n", p->migratable);
 }
 #else
 static inline void dump_payload(struct trusted_key_payload *p)
diff --git a/security/keys/trusted-keys/Kconfig b/security/keys/trusted-keys/Kconfig
index 9e00482d886a..c1ae7db1f612 100644
--- a/security/keys/trusted-keys/Kconfig
+++ b/security/keys/trusted-keys/Kconfig
@@ -1,10 +1,29 @@
 config HAVE_TRUSTED_KEYS
 	bool
 
+config HAVE_TRUSTED_KEYS_DEBUG
+	bool
+
+config TRUSTED_KEYS_DEBUG
+	bool "Debug trusted keys"
+	depends on HAVE_TRUSTED_KEYS_DEBUG
+	default n
+	help
+	  Trusted keys backends and core code that support debug traces can
+	  opt-in that feature here. Traces must only use debug level output, as
+	  sensitive data may pass by. In the kernel-command line traces can be
+	  enabled via trusted.dyndbg='+p'.
+
+	  SAFETY: Debug dumps are inactive at runtime until trusted.debug=1 is
+	  set on the kernel command-line. Use at your utmost consideration when
+	  enabling this feature on a production build. The general advice is not
+	  to do this.
+
 config TRUSTED_KEYS_TPM
 	bool "TPM-based trusted keys"
 	depends on TCG_TPM >= TRUSTED_KEYS
 	default y
+	select HAVE_TRUSTED_KEYS_DEBUG
 	select CRYPTO_HASH_INFO
 	select CRYPTO_LIB_SHA1
 	select CRYPTO_LIB_UTILS
@@ -23,6 +42,7 @@ config TRUSTED_KEYS_TEE
 	bool "TEE-based trusted keys"
 	depends on TEE >= TRUSTED_KEYS
 	default y
+	select HAVE_TRUSTED_KEYS_DEBUG
 	select HAVE_TRUSTED_KEYS
 	help
 	  Enable use of the Trusted Execution Environment (TEE) as trusted
@@ -33,6 +53,7 @@ config TRUSTED_KEYS_CAAM
 	depends on CRYPTO_DEV_FSL_CAAM_JR >= TRUSTED_KEYS
 	select CRYPTO_DEV_FSL_CAAM_BLOB_GEN
 	default y
+	select HAVE_TRUSTED_KEYS_DEBUG
 	select HAVE_TRUSTED_KEYS
 	help
 	  Enable use of NXP's Cryptographic Accelerator and Assurance Module
@@ -42,6 +63,7 @@ config TRUSTED_KEYS_DCP
 	bool "DCP-based trusted keys"
 	depends on CRYPTO_DEV_MXS_DCP >= TRUSTED_KEYS
 	default y
+	select HAVE_TRUSTED_KEYS_DEBUG
 	select HAVE_TRUSTED_KEYS
 	help
 	  Enable use of NXP's DCP (Data Co-Processor) as trusted key backend.
@@ -50,6 +72,7 @@ config TRUSTED_KEYS_PKWM
 	bool "PKWM-based trusted keys"
 	depends on PSERIES_PLPKS >= TRUSTED_KEYS
 	default y
+	select HAVE_TRUSTED_KEYS_DEBUG
 	select HAVE_TRUSTED_KEYS
 	help
 	  Enable use of IBM PowerVM Key Wrapping Module (PKWM) as a trusted key backend.
diff --git a/security/keys/trusted-keys/trusted_caam.c b/security/keys/trusted-keys/trusted_caam.c
index 601943ce0d60..6a33dbf2a7f5 100644
--- a/security/keys/trusted-keys/trusted_caam.c
+++ b/security/keys/trusted-keys/trusted_caam.c
@@ -28,10 +28,13 @@ static const match_table_t key_tokens = {
 	{opt_err, NULL}
 };
 
-#ifdef CAAM_DEBUG
+#ifdef CONFIG_TRUSTED_KEYS_DEBUG
 static inline void dump_options(const struct caam_pkey_info *pkey_info)
 {
-	pr_info("key encryption algo %d\n", pkey_info->key_enc_algo);
+	if (!trusted_debug)
+		return;
+
+	pr_debug("key encryption algo %d\n", pkey_info->key_enc_algo);
 }
 #else
 static inline void dump_options(const struct caam_pkey_info *pkey_info)
diff --git a/security/keys/trusted-keys/trusted_core.c b/security/keys/trusted-keys/trusted_core.c
index 9046123d94de..9ce2459d14b4 100644
--- a/security/keys/trusted-keys/trusted_core.c
+++ b/security/keys/trusted-keys/trusted_core.c
@@ -31,6 +31,12 @@ static char *trusted_rng = "default";
 module_param_named(rng, trusted_rng, charp, 0);
 MODULE_PARM_DESC(rng, "Select trusted key RNG");
 
+#ifdef CONFIG_TRUSTED_KEYS_DEBUG
+bool trusted_debug;
+module_param_named(debug, trusted_debug, bool, 0);
+MODULE_PARM_DESC(debug, "Enable trusted keys debug traces (default: 0)");
+#endif
+
 static char *trusted_key_source;
 module_param_named(source, trusted_key_source, charp, 0);
 MODULE_PARM_DESC(source, "Select trusted keys source (tpm, tee, caam, dcp or pkwm)");
diff --git a/security/keys/trusted-keys/trusted_tpm1.c b/security/keys/trusted-keys/trusted_tpm1.c
index c865c97aa1b4..b9fa2b4205cf 100644
--- a/security/keys/trusted-keys/trusted_tpm1.c
+++ b/security/keys/trusted-keys/trusted_tpm1.c
@@ -46,38 +46,44 @@ enum {
 	SRK_keytype = 4
 };
 
-#define TPM_DEBUG 0
-
-#if TPM_DEBUG
+#ifdef CONFIG_TRUSTED_KEYS_DEBUG
 static inline void dump_options(struct trusted_key_options *o)
 {
-	pr_info("sealing key type %d\n", o->keytype);
-	pr_info("sealing key handle %0X\n", o->keyhandle);
-	pr_info("pcrlock %d\n", o->pcrlock);
-	pr_info("pcrinfo %d\n", o->pcrinfo_len);
-	print_hex_dump(KERN_INFO, "pcrinfo ", DUMP_PREFIX_NONE,
-		       16, 1, o->pcrinfo, o->pcrinfo_len, 0);
+	if (!trusted_debug)
+		return;
+
+	pr_debug("sealing key type %d\n", o->keytype);
+	pr_debug("sealing key handle %0X\n", o->keyhandle);
+	pr_debug("pcrlock %d\n", o->pcrlock);
+	pr_debug("pcrinfo %d\n", o->pcrinfo_len);
+	print_hex_dump_debug("pcrinfo ", DUMP_PREFIX_NONE,
+			     16, 1, o->pcrinfo, o->pcrinfo_len, 0);
 }
 
 static inline void dump_sess(struct osapsess *s)
 {
-	print_hex_dump(KERN_INFO, "trusted-key: handle ", DUMP_PREFIX_NONE,
-		       16, 1, &s->handle, 4, 0);
-	pr_info("secret:\n");
-	print_hex_dump(KERN_INFO, "", DUMP_PREFIX_NONE,
-		       16, 1, &s->secret, SHA1_DIGEST_SIZE, 0);
-	pr_info("trusted-key: enonce:\n");
-	print_hex_dump(KERN_INFO, "", DUMP_PREFIX_NONE,
-		       16, 1, &s->enonce, SHA1_DIGEST_SIZE, 0);
+	if (!trusted_debug)
+		return;
+
+	print_hex_dump_debug("trusted-key: handle ", DUMP_PREFIX_NONE,
+			     16, 1, &s->handle, 4, 0);
+	pr_debug("secret:\n");
+	print_hex_dump_debug("", DUMP_PREFIX_NONE,
+			     16, 1, &s->secret, SHA1_DIGEST_SIZE, 0);
+	pr_debug("trusted-key: enonce:\n");
+	print_hex_dump_debug("", DUMP_PREFIX_NONE,
+			     16, 1, &s->enonce, SHA1_DIGEST_SIZE, 0);
 }
 
 static inline void dump_tpm_buf(unsigned char *buf)
 {
 	int len;
 
-	pr_info("\ntpm buffer\n");
+	if (!trusted_debug)
+		return;
+	pr_debug("\ntpm buffer\n");
 	len = LOAD32(buf, TPM_SIZE_OFFSET);
-	print_hex_dump(KERN_INFO, "", DUMP_PREFIX_NONE, 16, 1, buf, len, 0);
+	print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1, buf, len, 0);
 }
 #else
 static inline void dump_options(struct trusted_key_options *o)
-- 
2.39.5



* Re: [PATCH v4 2/3] lsm: add backing_file LSM hooks
From: Christian Brauner @ 2026-04-09 13:32 UTC (permalink / raw)
  To: Paul Moore
  Cc: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs, Amir Goldstein, Gao Xiang
In-Reply-To: <20260403030848.731867-7-paul@paul-moore.com>

On Thu, Apr 02, 2026 at 11:08:34PM -0400, Paul Moore wrote:
> Stacked filesystems such as overlayfs do not currently provide the
> necessary mechanisms for LSMs to properly enforce access controls on the
> mmap() and mprotect() operations.  In order to resolve this gap, an LSM
> security blob is being added to the backing_file struct and the following
> new LSM hooks are being created:
> 
>  security_backing_file_alloc()
>  security_backing_file_free()
>  security_mmap_backing_file()
> 
> The first two hooks are to manage the lifecycle of the LSM security blob
> in the backing_file struct, while the third provides a new mmap() access
> control point for the underlying backing file.  It is also expected that
> LSMs will likely want to update their security_file_mprotect() callback
> to address issues with their mprotect() controls, but that does not
> require a change to the security_file_mprotect() LSM hook.
> 
> There are three other small changes to support these new LSM hooks:
> * Pass the user file associated with a backing file down to
> alloc_empty_backing_file() so it can be included in the
> security_backing_file_alloc() hook.
> * Add getter and setter functions for the backing_file struct LSM blob
> as the backing_file struct remains private to fs/file_table.c.
> * Constify the file struct field in the LSM common_audit_data struct to
> better support LSMs that need to pass a const file struct pointer into
> the common LSM audit code.
> 
> Thanks to Arnd Bergmann for identifying the missing EXPORT_SYMBOL_GPL()
> and supplying a fixup.
> 
> Cc: stable@vger.kernel.org
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-unionfs@vger.kernel.org
> Cc: linux-erofs@lists.ozlabs.org
> Signed-off-by: Paul Moore <paul@paul-moore.com>
> ---

This looks very palatable now, thanks.
Reviewed-by: Christian Brauner <brauner@kernel.org>
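
For readers following along, the getter/setter approach described in the
commit message can be sketched in userspace as follows. This is a model under
stated assumptions, not the code from the patch: struct file_model, struct
backing_file_model, to_backing() and the model_*() accessors are hypothetical
names, and to_backing() is a simplified container_of(). The idea shown is that
code outside the defining file only uses the accessors and never needs the
container's layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file. */
struct file_model {
	unsigned int f_mode;
};

/* Stand-in for the private container; callers never touch its fields. */
struct backing_file_model {
	struct file_model file;
	void *security;
};

/* Simplified container_of(): recover the container from the embedded file. */
static struct backing_file_model *to_backing(const struct file_model *f)
{
	assert(f != NULL);
	return (struct backing_file_model *)
		((char *)f - offsetof(struct backing_file_model, file));
}

/* Models backing_file_security(). */
void *model_backing_file_security(const struct file_model *f)
{
	return to_backing(f)->security;
}

/* Models backing_file_set_security(). */
void model_backing_file_set_security(struct file_model *f, void *security)
{
	to_backing(f)->security = security;
}
```

The accessors only make sense for a file that is embedded in the container,
which mirrors the restriction to backing files in the patch.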

>  fs/backing-file.c             |  18 ++++--
>  fs/erofs/ishare.c             |  10 +++-
>  fs/file_table.c               |  27 +++++++--
>  fs/fuse/passthrough.c         |   2 +-
>  fs/internal.h                 |   3 +-
>  fs/overlayfs/dir.c            |   2 +-
>  fs/overlayfs/file.c           |   2 +-
>  include/linux/backing-file.h  |   4 +-
>  include/linux/fs.h            |  13 +++++
>  include/linux/lsm_audit.h     |   2 +-
>  include/linux/lsm_hook_defs.h |   5 ++
>  include/linux/lsm_hooks.h     |   1 +
>  include/linux/security.h      |  22 ++++++++
>  security/lsm.h                |   1 +
>  security/lsm_init.c           |   9 +++
>  security/security.c           | 102 ++++++++++++++++++++++++++++++++++
>  16 files changed, 206 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/backing-file.c b/fs/backing-file.c
> index 45da8600d564..1f3bbfc75882 100644
> --- a/fs/backing-file.c
> +++ b/fs/backing-file.c
> @@ -12,6 +12,7 @@
>  #include <linux/backing-file.h>
>  #include <linux/splice.h>
>  #include <linux/mm.h>
> +#include <linux/security.h>
>  
>  #include "internal.h"
>  
> @@ -29,14 +30,15 @@
>   * returned file into a container structure that also stores the stacked
>   * file's path, which can be retrieved using backing_file_user_path().
>   */
> -struct file *backing_file_open(const struct path *user_path, int flags,
> +struct file *backing_file_open(const struct file *user_file, int flags,
>  			       const struct path *real_path,
>  			       const struct cred *cred)
>  {
> +	const struct path *user_path = &user_file->f_path;
>  	struct file *f;
>  	int error;
>  
> -	f = alloc_empty_backing_file(flags, cred);
> +	f = alloc_empty_backing_file(flags, cred, user_file);
>  	if (IS_ERR(f))
>  		return f;
>  
> @@ -52,15 +54,16 @@ struct file *backing_file_open(const struct path *user_path, int flags,
>  }
>  EXPORT_SYMBOL_GPL(backing_file_open);
>  
> -struct file *backing_tmpfile_open(const struct path *user_path, int flags,
> +struct file *backing_tmpfile_open(const struct file *user_file, int flags,
>  				  const struct path *real_parentpath,
>  				  umode_t mode, const struct cred *cred)
>  {
>  	struct mnt_idmap *real_idmap = mnt_idmap(real_parentpath->mnt);
> +	const struct path *user_path = &user_file->f_path;
>  	struct file *f;
>  	int error;
>  
> -	f = alloc_empty_backing_file(flags, cred);
> +	f = alloc_empty_backing_file(flags, cred, user_file);
>  	if (IS_ERR(f))
>  		return f;
>  
> @@ -336,8 +339,13 @@ int backing_file_mmap(struct file *file, struct vm_area_struct *vma,
>  
>  	vma_set_file(vma, file);
>  
> -	scoped_with_creds(ctx->cred)
> +	scoped_with_creds(ctx->cred) {
> +		ret = security_mmap_backing_file(vma, file, user_file);
> +		if (ret)
> +			return ret;
> +
>  		ret = vfs_mmap(vma->vm_file, vma);
> +	}
>  
>  	if (ctx->accessed)
>  		ctx->accessed(user_file);
> diff --git a/fs/erofs/ishare.c b/fs/erofs/ishare.c
> index ec433bacc592..6ed66b17359b 100644
> --- a/fs/erofs/ishare.c
> +++ b/fs/erofs/ishare.c
> @@ -4,6 +4,7 @@
>   */
>  #include <linux/xxhash.h>
>  #include <linux/mount.h>
> +#include <linux/security.h>
>  #include "internal.h"
>  #include "xattr.h"
>  
> @@ -106,7 +107,8 @@ static int erofs_ishare_file_open(struct inode *inode, struct file *file)
>  
>  	if (file->f_flags & O_DIRECT)
>  		return -EINVAL;
> -	realfile = alloc_empty_backing_file(O_RDONLY|O_NOATIME, current_cred());
> +	realfile = alloc_empty_backing_file(O_RDONLY|O_NOATIME, current_cred(),
> +					    file);
>  	if (IS_ERR(realfile))
>  		return PTR_ERR(realfile);
>  	ihold(sharedinode);
> @@ -150,8 +152,14 @@ static ssize_t erofs_ishare_file_read_iter(struct kiocb *iocb,
>  static int erofs_ishare_mmap(struct file *file, struct vm_area_struct *vma)
>  {
>  	struct file *realfile = file->private_data;
> +	int err;
>  
>  	vma_set_file(vma, realfile);
> +
> +	err = security_mmap_backing_file(vma, realfile, file);
> +	if (err)
> +		return err;
> +
>  	return generic_file_readonly_mmap(file, vma);
>  }
>  
> diff --git a/fs/file_table.c b/fs/file_table.c
> index 3b3792903185..d19d879b6efc 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -50,6 +50,9 @@ struct backing_file {
>  		struct path user_path;
>  		freeptr_t bf_freeptr;
>  	};
> +#ifdef CONFIG_SECURITY
> +	void *security;
> +#endif
>  };
>  
>  #define backing_file(f) container_of(f, struct backing_file, file)
> @@ -66,8 +69,21 @@ void backing_file_set_user_path(struct file *f, const struct path *path)
>  }
>  EXPORT_SYMBOL_GPL(backing_file_set_user_path);
>  
> +#ifdef CONFIG_SECURITY
> +void *backing_file_security(const struct file *f)
> +{
> +	return backing_file(f)->security;
> +}
> +
> +void backing_file_set_security(struct file *f, void *security)
> +{
> +	backing_file(f)->security = security;
> +}
> +#endif /* CONFIG_SECURITY */
> +
>  static inline void backing_file_free(struct backing_file *ff)
>  {
> +	security_backing_file_free(&ff->file);
>  	path_put(&ff->user_path);
>  	kmem_cache_free(bfilp_cachep, ff);
>  }
> @@ -288,10 +304,12 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
>  	return f;
>  }
>  
> -static int init_backing_file(struct backing_file *ff)
> +static int init_backing_file(struct backing_file *ff,
> +			     const struct file *user_file)
>  {
>  	memset(&ff->user_path, 0, sizeof(ff->user_path));
> -	return 0;
> +	backing_file_set_security(&ff->file, NULL);
> +	return security_backing_file_alloc(&ff->file, user_file);
>  }
>  
>  /*
> @@ -301,7 +319,8 @@ static int init_backing_file(struct backing_file *ff)
>   * This is only for kernel internal use, and the allocate file must not be
>   * installed into file tables or such.
>   */
> -struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
> +struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
> +				      const struct file *user_file)
>  {
>  	struct backing_file *ff;
>  	int error;
> @@ -318,7 +337,7 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
>  
>  	/* The f_mode flags must be set before fput(). */
>  	ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
> -	error = init_backing_file(ff);
> +	error = init_backing_file(ff, user_file);
>  	if (unlikely(error)) {
>  		fput(&ff->file);
>  		return ERR_PTR(error);
> diff --git a/fs/fuse/passthrough.c b/fs/fuse/passthrough.c
> index 72de97c03d0e..f2d08ac2459b 100644
> --- a/fs/fuse/passthrough.c
> +++ b/fs/fuse/passthrough.c
> @@ -167,7 +167,7 @@ struct fuse_backing *fuse_passthrough_open(struct file *file, int backing_id)
>  		goto out;
>  
>  	/* Allocate backing file per fuse file to store fuse path */
> -	backing_file = backing_file_open(&file->f_path, file->f_flags,
> +	backing_file = backing_file_open(file, file->f_flags,
>  					 &fb->file->f_path, fb->cred);
>  	err = PTR_ERR(backing_file);
>  	if (IS_ERR(backing_file)) {
> diff --git a/fs/internal.h b/fs/internal.h
> index cbc384a1aa09..77e90e4124e0 100644
> --- a/fs/internal.h
> +++ b/fs/internal.h
> @@ -106,7 +106,8 @@ extern void chroot_fs_refs(const struct path *, const struct path *);
>   */
>  struct file *alloc_empty_file(int flags, const struct cred *cred);
>  struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred);
> -struct file *alloc_empty_backing_file(int flags, const struct cred *cred);
> +struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
> +				      const struct file *user_file);
>  void backing_file_set_user_path(struct file *f, const struct path *path);
>  
>  static inline void file_put_write_access(struct file *file)
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index ff3dbd1ca61f..f2f20a611af3 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -1374,7 +1374,7 @@ static int ovl_create_tmpfile(struct file *file, struct dentry *dentry,
>  				return PTR_ERR(cred);
>  
>  			ovl_path_upper(dentry->d_parent, &realparentpath);
> -			realfile = backing_tmpfile_open(&file->f_path, flags, &realparentpath,
> +			realfile = backing_tmpfile_open(file, flags, &realparentpath,
>  							mode, current_cred());
>  			err = PTR_ERR_OR_ZERO(realfile);
>  			pr_debug("tmpfile/open(%pd2, 0%o) = %i\n", realparentpath.dentry, mode, err);
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index 97bed2286030..27cc07738f33 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -48,7 +48,7 @@ static struct file *ovl_open_realfile(const struct file *file,
>  			if (!inode_owner_or_capable(real_idmap, realinode))
>  				flags &= ~O_NOATIME;
>  
> -			realfile = backing_file_open(file_user_path(file),
> +			realfile = backing_file_open(file,
>  						     flags, realpath, current_cred());
>  		}
>  	}
> diff --git a/include/linux/backing-file.h b/include/linux/backing-file.h
> index 1476a6ed1bfd..c939cd222730 100644
> --- a/include/linux/backing-file.h
> +++ b/include/linux/backing-file.h
> @@ -18,10 +18,10 @@ struct backing_file_ctx {
>  	void (*end_write)(struct kiocb *iocb, ssize_t);
>  };
>  
> -struct file *backing_file_open(const struct path *user_path, int flags,
> +struct file *backing_file_open(const struct file *user_file, int flags,
>  			       const struct path *real_path,
>  			       const struct cred *cred);
> -struct file *backing_tmpfile_open(const struct path *user_path, int flags,
> +struct file *backing_tmpfile_open(const struct file *user_file, int flags,
>  				  const struct path *real_parentpath,
>  				  umode_t mode, const struct cred *cred);
>  ssize_t backing_file_read_iter(struct file *file, struct iov_iter *iter,
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 8b3dd145b25e..d0d0e8f55589 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2475,6 +2475,19 @@ struct file *dentry_create(struct path *path, int flags, umode_t mode,
>  			   const struct cred *cred);
>  const struct path *backing_file_user_path(const struct file *f);
>  
> +#ifdef CONFIG_SECURITY
> +void *backing_file_security(const struct file *f);
> +void backing_file_set_security(struct file *f, void *security);
> +#else
> +static inline void *backing_file_security(const struct file *f)
> +{
> +	return NULL;
> +}
> +static inline void backing_file_set_security(struct file *f, void *security)
> +{
> +}
> +#endif /* CONFIG_SECURITY */
> +
>  /*
>   * When mmapping a file on a stackable filesystem (e.g., overlayfs), the file
>   * stored in ->vm_file is a backing file whose f_inode is on the underlying
> diff --git a/include/linux/lsm_audit.h b/include/linux/lsm_audit.h
> index 382c56a97bba..584db296e43b 100644
> --- a/include/linux/lsm_audit.h
> +++ b/include/linux/lsm_audit.h
> @@ -94,7 +94,7 @@ struct common_audit_data {
>  #endif
>  		char *kmod_name;
>  		struct lsm_ioctlop_audit *op;
> -		struct file *file;
> +		const struct file *file;
>  		struct lsm_ibpkey_audit *ibpkey;
>  		struct lsm_ibendport_audit *ibendport;
>  		int reason;
> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> index 8c42b4bde09c..b4958167e381 100644
> --- a/include/linux/lsm_hook_defs.h
> +++ b/include/linux/lsm_hook_defs.h
> @@ -191,6 +191,9 @@ LSM_HOOK(int, 0, file_permission, struct file *file, int mask)
>  LSM_HOOK(int, 0, file_alloc_security, struct file *file)
>  LSM_HOOK(void, LSM_RET_VOID, file_release, struct file *file)
>  LSM_HOOK(void, LSM_RET_VOID, file_free_security, struct file *file)
> +LSM_HOOK(int, 0, backing_file_alloc, struct file *backing_file,
> +	 const struct file *user_file)
> +LSM_HOOK(void, LSM_RET_VOID, backing_file_free, struct file *backing_file)
>  LSM_HOOK(int, 0, file_ioctl, struct file *file, unsigned int cmd,
>  	 unsigned long arg)
>  LSM_HOOK(int, 0, file_ioctl_compat, struct file *file, unsigned int cmd,
> @@ -198,6 +201,8 @@ LSM_HOOK(int, 0, file_ioctl_compat, struct file *file, unsigned int cmd,
>  LSM_HOOK(int, 0, mmap_addr, unsigned long addr)
>  LSM_HOOK(int, 0, mmap_file, struct file *file, unsigned long reqprot,
>  	 unsigned long prot, unsigned long flags)
> +LSM_HOOK(int, 0, mmap_backing_file, struct vm_area_struct *vma,
> +	 struct file *backing_file, struct file *user_file)
>  LSM_HOOK(int, 0, file_mprotect, struct vm_area_struct *vma,
>  	 unsigned long reqprot, unsigned long prot)
>  LSM_HOOK(int, 0, file_lock, struct file *file, unsigned int cmd)
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index d48bf0ad26f4..b4f8cad53ddb 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -104,6 +104,7 @@ struct security_hook_list {
>  struct lsm_blob_sizes {
>  	unsigned int lbs_cred;
>  	unsigned int lbs_file;
> +	unsigned int lbs_backing_file;
>  	unsigned int lbs_ib;
>  	unsigned int lbs_inode;
>  	unsigned int lbs_sock;
> diff --git a/include/linux/security.h b/include/linux/security.h
> index ee88dd2d2d1f..8d2d4856934e 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -472,11 +472,17 @@ int security_file_permission(struct file *file, int mask);
>  int security_file_alloc(struct file *file);
>  void security_file_release(struct file *file);
>  void security_file_free(struct file *file);
> +int security_backing_file_alloc(struct file *backing_file,
> +				const struct file *user_file);
> +void security_backing_file_free(struct file *backing_file);
>  int security_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
>  int security_file_ioctl_compat(struct file *file, unsigned int cmd,
>  			       unsigned long arg);
>  int security_mmap_file(struct file *file, unsigned long prot,
>  			unsigned long flags);
> +int security_mmap_backing_file(struct vm_area_struct *vma,
> +			       struct file *backing_file,
> +			       struct file *user_file);
>  int security_mmap_addr(unsigned long addr);
>  int security_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
>  			   unsigned long prot);
> @@ -1141,6 +1147,15 @@ static inline void security_file_release(struct file *file)
>  static inline void security_file_free(struct file *file)
>  { }
>  
> +static inline int security_backing_file_alloc(struct file *backing_file,
> +					      const struct file *user_file)
> +{
> +	return 0;
> +}
> +
> +static inline void security_backing_file_free(struct file *backing_file)
> +{ }
> +
>  static inline int security_file_ioctl(struct file *file, unsigned int cmd,
>  				      unsigned long arg)
>  {
> @@ -1160,6 +1175,13 @@ static inline int security_mmap_file(struct file *file, unsigned long prot,
>  	return 0;
>  }
>  
> +static inline int security_mmap_backing_file(struct vm_area_struct *vma,
> +					     struct file *backing_file,
> +					     struct file *user_file)
> +{
> +	return 0;
> +}
> +
>  static inline int security_mmap_addr(unsigned long addr)
>  {
>  	return cap_mmap_addr(addr);
> diff --git a/security/lsm.h b/security/lsm.h
> index db77cc83e158..32f808ad4335 100644
> --- a/security/lsm.h
> +++ b/security/lsm.h
> @@ -29,6 +29,7 @@ extern struct lsm_blob_sizes blob_sizes;
>  
>  /* LSM blob caches */
>  extern struct kmem_cache *lsm_file_cache;
> +extern struct kmem_cache *lsm_backing_file_cache;
>  extern struct kmem_cache *lsm_inode_cache;
>  
>  /* LSM blob allocators */
> diff --git a/security/lsm_init.c b/security/lsm_init.c
> index 573e2a7250c4..7c0fd17f1601 100644
> --- a/security/lsm_init.c
> +++ b/security/lsm_init.c
> @@ -293,6 +293,8 @@ static void __init lsm_prepare(struct lsm_info *lsm)
>  	blobs = lsm->blobs;
>  	lsm_blob_size_update(&blobs->lbs_cred, &blob_sizes.lbs_cred);
>  	lsm_blob_size_update(&blobs->lbs_file, &blob_sizes.lbs_file);
> +	lsm_blob_size_update(&blobs->lbs_backing_file,
> +			     &blob_sizes.lbs_backing_file);
>  	lsm_blob_size_update(&blobs->lbs_ib, &blob_sizes.lbs_ib);
>  	/* inode blob gets an rcu_head in addition to LSM blobs. */
>  	if (blobs->lbs_inode && blob_sizes.lbs_inode == 0)
> @@ -441,6 +443,8 @@ int __init security_init(void)
>  	if (lsm_debug) {
>  		lsm_pr("blob(cred) size %d\n", blob_sizes.lbs_cred);
>  		lsm_pr("blob(file) size %d\n", blob_sizes.lbs_file);
> +		lsm_pr("blob(backing_file) size %d\n",
> +		       blob_sizes.lbs_backing_file);
>  		lsm_pr("blob(ib) size %d\n", blob_sizes.lbs_ib);
>  		lsm_pr("blob(inode) size %d\n", blob_sizes.lbs_inode);
>  		lsm_pr("blob(ipc) size %d\n", blob_sizes.lbs_ipc);
> @@ -462,6 +466,11 @@ int __init security_init(void)
>  		lsm_file_cache = kmem_cache_create("lsm_file_cache",
>  						   blob_sizes.lbs_file, 0,
>  						   SLAB_PANIC, NULL);
> +	if (blob_sizes.lbs_backing_file)
> +		lsm_backing_file_cache = kmem_cache_create(
> +						   "lsm_backing_file_cache",
> +						   blob_sizes.lbs_backing_file,
> +						   0, SLAB_PANIC, NULL);
>  	if (blob_sizes.lbs_inode)
>  		lsm_inode_cache = kmem_cache_create("lsm_inode_cache",
>  						    blob_sizes.lbs_inode, 0,
> diff --git a/security/security.c b/security/security.c
> index a26c1474e2e4..048560ef6a1a 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -82,6 +82,7 @@ const struct lsm_id *lsm_idlist[MAX_LSM_COUNT];
>  struct lsm_blob_sizes blob_sizes;
>  
>  struct kmem_cache *lsm_file_cache;
> +struct kmem_cache *lsm_backing_file_cache;
>  struct kmem_cache *lsm_inode_cache;
>  
>  #define SECURITY_HOOK_ACTIVE_KEY(HOOK, IDX) security_hook_active_##HOOK##_##IDX
> @@ -173,6 +174,30 @@ static int lsm_file_alloc(struct file *file)
>  	return 0;
>  }
>  
> +/**
> + * lsm_backing_file_alloc - allocate a composite backing file blob
> + * @backing_file: the backing file
> + *
> + * Allocate the backing file blob for all the modules.
> + *
> + * Returns 0, or -ENOMEM if memory can't be allocated.
> + */
> +static int lsm_backing_file_alloc(struct file *backing_file)
> +{
> +	void *blob;
> +
> +	if (!lsm_backing_file_cache) {
> +		backing_file_set_security(backing_file, NULL);
> +		return 0;
> +	}
> +
> +	blob = kmem_cache_zalloc(lsm_backing_file_cache, GFP_KERNEL);
> +	backing_file_set_security(backing_file, blob);
> +	if (!blob)
> +		return -ENOMEM;
> +	return 0;
> +}
> +
>  /**
>   * lsm_blob_alloc - allocate a composite blob
>   * @dest: the destination for the blob
> @@ -2418,6 +2443,57 @@ void security_file_free(struct file *file)
>  	}
>  }
>  
> +/**
> + * security_backing_file_alloc() - Allocate and setup a backing file blob
> + * @backing_file: the backing file
> + * @user_file: the associated user visible file
> + *
> + * Allocate a backing file LSM blob and perform any necessary initialization of
> + * the LSM blob.  There will be some operations where the LSM will not have
> + * access to @user_file after this point, so any important state associated
> + * with @user_file that is important to the LSM should be captured in the
> + * backing file's LSM blob.
> + *
> + * LSMs should avoid taking a reference to @user_file in this hook as it will
> + * result in problems later when the system attempts to drop/put the file
> + * references due to a circular dependency.
> + *
> + * Return: Return 0 if the hook is successful, negative values otherwise.
> + */
> +int security_backing_file_alloc(struct file *backing_file,
> +				const struct file *user_file)
> +{
> +	int rc;
> +
> +	rc = lsm_backing_file_alloc(backing_file);
> +	if (rc)
> +		return rc;
> +	rc = call_int_hook(backing_file_alloc, backing_file, user_file);
> +	if (unlikely(rc))
> +		security_backing_file_free(backing_file);
> +
> +	return rc;
> +}
> +
> +/**
> + * security_backing_file_free() - Free a backing file blob
> + * @backing_file: the backing file
> + *
> + * Free any LSM state associated with a backing file's LSM blob, including the
> + * blob itself.
> + */
> +void security_backing_file_free(struct file *backing_file)
> +{
> +	void *blob = backing_file_security(backing_file);
> +
> +	call_void_hook(backing_file_free, backing_file);
> +
> +	if (blob) {
> +		backing_file_set_security(backing_file, NULL);
> +		kmem_cache_free(lsm_backing_file_cache, blob);
> +	}
> +}
> +
>  /**
>   * security_file_ioctl() - Check if an ioctl is allowed
>   * @file: associated file
> @@ -2506,6 +2582,32 @@ int security_mmap_file(struct file *file, unsigned long prot,
>  			     flags);
>  }
>  
> +/**
> + * security_mmap_backing_file - Check if mmap'ing a backing file is allowed
> + * @vma: the vm_area_struct for the mmap'd region
> + * @backing_file: the backing file being mmap'd
> + * @user_file: the user file being mmap'd
> + *
> + * Check permissions for a mmap operation on a stacked filesystem.  This hook
> + * is called after the security_mmap_file() and is responsible for authorizing
> + * the mmap on @backing_file.  It is important to note that the mmap operation
> + * on @user_file has already been authorized and the @vma->vm_file has been
> + * set to @backing_file.
> + *
> + * Return: Returns 0 if permission is granted.
> + */
> +int security_mmap_backing_file(struct vm_area_struct *vma,
> +			       struct file *backing_file,
> +			       struct file *user_file)
> +{
> +	/* recommended by the stackable filesystem devs */
> +	if (WARN_ON_ONCE(!(backing_file->f_mode & FMODE_BACKING)))
> +		return -EIO;
> +
> +	return call_int_hook(mmap_backing_file, vma, backing_file, user_file);
> +}
> +EXPORT_SYMBOL_GPL(security_mmap_backing_file);
> +
>  /**
>   * security_mmap_addr() - Check if mmap'ing an address is allowed
>   * @addr: address
> -- 
> 2.53.0
> 
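
The allocate, call-hook, unwind-on-failure flow of
security_backing_file_alloc() in the patch above can be modeled in
userspace.  What follows is a minimal sketch and not kernel code: every
demo_* name is hypothetical, calloc() stands in for the kmem cache, and a
plain function pointer stands in for call_int_hook().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a backing file; not the kernel's struct file. */
struct demo_backing_file {
	void *security;	/* composite blob, NULL if no LSM asked for space */
};

/* Plays the role of blob_sizes.lbs_backing_file; 0 means "no blob needed". */
size_t demo_blob_size;

/* Stand-in for the backing_file_alloc LSM hook. */
typedef int (*demo_hook_t)(struct demo_backing_file *bf, const void *user_file);

/* Example hook that always refuses, used to exercise the unwind path. */
int demo_deny_hook(struct demo_backing_file *bf, const void *user_file)
{
	(void)bf;
	(void)user_file;
	return -EACCES;
}

/* Release the blob and clear the pointer so a second free is harmless. */
void demo_security_backing_file_free(struct demo_backing_file *bf)
{
	assert(bf != NULL);
	free(bf->security);
	bf->security = NULL;
}

/* Allocate a zeroed blob, run the hook, and free the blob if the hook fails. */
int demo_security_backing_file_alloc(struct demo_backing_file *bf,
				     const void *user_file, demo_hook_t hook)
{
	int rc;

	assert(bf != NULL);
	bf->security = NULL;
	if (demo_blob_size) {
		bf->security = calloc(1, demo_blob_size);
		if (!bf->security)
			return -ENOMEM;
	}

	rc = hook ? hook(bf, user_file) : 0;
	if (rc)
		demo_security_backing_file_free(bf);
	return rc;
}
```

The point of the sketch is the ownership rule: once the allocation
function returns an error, the caller holds no blob and has nothing to
clean up.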

* Re: [RFC PATCH v1 02/11] security: Add LSM_AUDIT_DATA_NS for namespace audit records
From: Christian Brauner @ 2026-04-09 13:29 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, Paul Moore, Serge E . Hallyn, Justin Suess,
	Lennart Poettering, Mikhail Ivanov, Nicolas Bouchinet,
	Shervin Oloumi, Tingmao Wang, kernel-team, linux-fsdevel,
	linux-kernel, linux-security-module
In-Reply-To: <20260401.AhPieg6heeth@digikod.net>

On Wed, Apr 01, 2026 at 08:48:23PM +0200, Mickaël Salaün wrote:
> On Wed, Apr 01, 2026 at 06:38:34PM +0200, Mickaël Salaün wrote:
> > On Wed, Mar 25, 2026 at 01:32:42PM +0100, Christian Brauner wrote:
> > > On Thu, Mar 12, 2026 at 11:04:35AM +0100, Mickaël Salaün wrote:
> > > > Add a new LSM audit data type LSM_AUDIT_DATA_NS that logs namespace
> > > > information in audit records.  Two fields are provided, matching the
> > > > field names of struct ns_common:
> > > > 
> > > > - ns_type: the CLONE_NEW* flag identifying the namespace type, logged in
> > > >   hexadecimal.
> > > > 
> > > > - inum: the proc inode number identifying a specific namespace instance.
> > > >   Namespace inode numbers are allocated by proc_alloc_inum() via
> > > >   ida_alloc_max() bounded to UINT_MAX, so the value always fits in 32
> > > >   bits.
> > > > 
> > > > A new audit data type is needed because no existing LSM_AUDIT_DATA_*
> > > > type carries namespace information.  The closest alternatives (e.g.
> > > > LSM_AUDIT_DATA_TASK or LSM_AUDIT_DATA_NONE with custom strings) would
> > > > either lose the namespace type or require ad-hoc formatting that
> > > > bypasses the structured audit data union.
> > > > 
> > > > Cc: Christian Brauner <brauner@kernel.org>
> > > > Cc: Günther Noack <gnoack@google.com>
> > > > Cc: Paul Moore <paul@paul-moore.com>
> > > > Signed-off-by: Mickaël Salaün <mic@digikod.net>
> > > > ---
> > > >  include/linux/lsm_audit.h | 5 +++++
> > > >  security/lsm_audit.c      | 4 ++++
> > > >  2 files changed, 9 insertions(+)
> > > > 
> > > > diff --git a/include/linux/lsm_audit.h b/include/linux/lsm_audit.h
> > > > index 382c56a97bba..6e20a56b8c22 100644
> > > > --- a/include/linux/lsm_audit.h
> > > > +++ b/include/linux/lsm_audit.h
> > > > @@ -78,6 +78,7 @@ struct common_audit_data {
> > > >  #define LSM_AUDIT_DATA_NOTIFICATION 16
> > > >  #define LSM_AUDIT_DATA_ANONINODE	17
> > > >  #define LSM_AUDIT_DATA_NLMSGTYPE	18
> > > > +#define LSM_AUDIT_DATA_NS		19
> > > >  	union 	{
> > > >  		struct path path;
> > > >  		struct dentry *dentry;
> > > > @@ -100,6 +101,10 @@ struct common_audit_data {
> > > >  		int reason;
> > > >  		const char *anonclass;
> > > >  		u16 nlmsg_type;
> > > > +		struct {
> > > > +			u32 ns_type;
> > > > +			unsigned int inum;
> > > 
> > > fwiw, you might want to start the 64-bit namespace id as well.
> > > But either way:
> > 
> > Right now these numbers are generated by ida_alloc_max(), which returns
> > an int.  Is there an ongoing patch series for this change?
> 
> OK, we should not use the inode's number (32-bit) but the namespace ID
> (64-bit) which is readable with the NS_GET_ID IOCTL on the namespace
> FDs.  I'll use that with ns_id instead of inum.  I'll also update the
> Landlock code and tests accordingly.

Yes, it's embedded in struct ns_common.
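
To illustrate why the field width matters, here is a minimal userspace
sketch of an audit payload that carries ns_id instead of inum.  The
struct, the function name and the exact record format are assumptions
made for this example; the thread only establishes that ns_type is logged
in hexadecimal and that the ID is 64 bits wide.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of the proposed audit data, with ns_id replacing inum. */
struct demo_ns_audit {
	uint32_t ns_type;	/* CLONE_NEW* flag identifying the namespace type */
	uint64_t ns_id;		/* 64-bit namespace ID, may exceed UINT_MAX */
};

/*
 * Format the two fields into buf.  Returns the snprintf() result, that is
 * the length the full record needs, excluding the terminating NUL.
 */
int demo_format_ns_audit(char *buf, size_t len, const struct demo_ns_audit *ns)
{
	assert(buf != NULL && ns != NULL);
	return snprintf(buf, len, " ns_type=0x%" PRIx32 " ns_id=%" PRIu64,
			ns->ns_type, ns->ns_id);
}
```

An ns_id of 4294967296 (2^32) is printed in full here, whereas it would be
truncated if it were stored in the unsigned int originally used for inum.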

* Re: LSM: Whiteout chardev creation sidesteps mknod hook
From: Christian Brauner @ 2026-04-09 12:47 UTC (permalink / raw)
  To: Serge Hallyn, Miklos Szeredi, Amir Goldstein
  Cc: Günther Noack, Mickaël Salaün, Paul Moore,
	linux-security-module
In-Reply-To: <06337e89-349a-4334-a735-b8dc9b566cdd@hallyn.com>

On Tue, Apr 07, 2026 at 12:15:00PM -0500, Serge Hallyn wrote:
> Apr 7, 2026 08:05:43 Günther Noack <gnoack@google.com>:
> 
> > Hello Christian, Paul, Mickaël and LSM maintainers!
> >
> > I discovered the following bug in Landlock, which potentially also
> > affects other LSMs:
> >
> > With renameat2(2)'s RENAME_WHITEOUT flag, it is possible to create a
> > "whiteout object" at the source of the rename.  Whiteout objects are
> > character devices with major/minor (0, 0) -- these devices are not
> > bound to any driver, so they are harmless, but still, the creation of
> > these files can sidestep the LANDLOCK_ACCESS_FS_MAKE_CHAR access right
> > in Landlock.

They aren't devices.

> >
> >
> > I am unsure which is the right fix here -- do you have an opinion
> > on this from the VFS/LSM side?
> >
> >
> > Option 1: Make filesystems call security_path_mknod() during RENAME_WHITEOUT?

No.

> >
> > Do it in the VFS rename hook.
> >
> > * Pro: Fixes it for all LSMs
> > * Con: Call would have to be done in multiple filesystems
> >
> >
> > Option 2: Handle it in security_{path,inode}_rename()
> >
> > Make Landlock handle it in security_inode_rename() by looking for the
> > RENAME_WHITEOUT flag.
> >
> > * Con: Operation should only be denied if the file system even
> >   implements RENAME_WHITEOUT, and we would have to maintain a list of

Why? Just deny RENAME_WHITEOUT. What does it matter if the filesystem
implements it or not? Overlayfs would fall back to non-RENAME_WHITEOUT
if not provided by the upper fs anyway.

> >   affected filesystems for that.  (That feels like solving it at the
> >   wrong layer of abstraction.)
> > * Con: Unclear whether other LSMs need a similar fix
> >
> >
> > Option 3: Declare that this is working as intended?
> 
> Option 3 has my vote.

Seconded.

> 
> 
> > * Pro: (0, 0) is not a "real" character device
> >
> >
> > In cases 1 and 2, we'd likely need to double check that we are not
> > breaking existing scenarios involving OverlayFS, by suddenly requiring
> > a more lax policy for creating character devices on these directories.
> >
> > Please let me know what you think.  I'm specifically interested in:
> >
> > 1. Christian: What is the appropriate way to do this VFS wise?
> > 2. LSM maintainers: Is this a bug that affects other LSMs as well?
> >
> > Thanks,
> > —Günther
> >
> > P.S.: For full transparency, I found this bug by pointing Google
> > Gemini at the Landlock codebase.

Clearly.
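
For reference, the "just deny RENAME_WHITEOUT" variant of Option 2 amounts
to a flag test in the rename path, independent of the filesystem.  The
sketch below is a userspace illustration only: demo_check_rename() and its
may_make_char parameter are hypothetical and do not correspond to any
existing Landlock or LSM function.  Note that the replies above favor
Option 3, so this shows what Option 2 would look like, not what was agreed.

```c
#include <assert.h>
#include <errno.h>

#ifndef RENAME_WHITEOUT
#define RENAME_WHITEOUT (1 << 2)	/* flag value from the renameat2(2) UAPI */
#endif

/*
 * Hypothetical policy check: refuse a whiteout-creating rename unless the
 * caller's policy allows creating character devices.  The decision does
 * not depend on whether the filesystem implements RENAME_WHITEOUT.
 */
int demo_check_rename(unsigned int flags, int may_make_char)
{
	assert(may_make_char == 0 || may_make_char == 1);
	if ((flags & RENAME_WHITEOUT) && !may_make_char)
		return -EACCES;
	return 0;
}
```

Renames without the flag are unaffected, which is why no per-filesystem
list would be needed for this approach.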

* Re: [PATCH v2 0/4] Firmware LSM hook
From: Leon Romanovsky @ 2026-04-09 12:45 UTC (permalink / raw)
  To: Roberto Sassu
  Cc: KP Singh, Matt Bobrowski, Alexei Starovoitov, Daniel Borkmann,
	John Fastabend, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Song Liu, Yonghong Song, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Shuah Khan, Jason Gunthorpe, Saeed Mahameed,
	Itay Avraham, Dave Jiang, Jonathan Cameron, bpf, linux-kernel,
	linux-kselftest, linux-rdma, Chiara Meiohas, Maher Sanalla, paul,
	linux-security-module
In-Reply-To: <2dd138a2ae87f90c55dbc3178d9c798294fd4450.camel@huaweicloud.com>

On Thu, Apr 09, 2026 at 02:27:43PM +0200, Roberto Sassu wrote:
> On Thu, 2026-04-09 at 15:12 +0300, Leon Romanovsky wrote:
> > On Tue, Mar 31, 2026 at 08:56:32AM +0300, Leon Romanovsky wrote:
> > > From Chiara:
> > > 
> > > This patch set introduces a new BPF LSM hook to validate firmware commands
> > > triggered by userspace before they are submitted to the device. The hook
> > > runs after the command buffer is constructed, right before it is sent
> > > to firmware.
> > 
> > <...>
> > 
> > > ---
> > > Chiara Meiohas (4):
> > >       bpf: add firmware command validation hook
> > >       selftests/bpf: add test cases for fw_validate_cmd hook
> > >       RDMA/mlx5: Externally validate FW commands supplied in DEVX interface
> > >       fwctl/mlx5: Externally validate FW commands supplied in fwctl
> > 
> > Hi,
> > 
> > Can we get Ack from BPF/LSM side?
> 
> + Paul, linux-security-module ML
> 
> Hi
> 
> probably you also want to get an Ack from the LSM maintainer (added in
> CC with the list). Most likely, he will also ask you to create the
> security_*() functions counterparts of the BPF hooks.

We implemented this approach in v1:
https://patch.msgid.link/20260309-fw-lsm-hook-v1-0-4a6422e63725@nvidia.com
and were advised to pursue a different direction.

Thanks

> 
> Roberto
> 
> > Thanks
> > 
> > > 
> > >  drivers/fwctl/mlx5/main.c                        | 12 +++++-
> > >  drivers/infiniband/hw/mlx5/devx.c                | 49 ++++++++++++++++++------
> > >  include/linux/bpf_lsm.h                          | 41 ++++++++++++++++++++
> > >  kernel/bpf/bpf_lsm.c                             | 11 ++++++
> > >  tools/testing/selftests/bpf/progs/verifier_lsm.c | 23 +++++++++++
> > >  5 files changed, 122 insertions(+), 14 deletions(-)
> > > ---
> > > base-commit: 11439c4635edd669ae435eec308f4ab8a0804808
> > > change-id: 20260309-fw-lsm-hook-7c094f909ffc
> > > 
> > > Best regards,
> > > --  
> > > Leon Romanovsky <leonro@nvidia.com>
> > > 
> 
> 

* Re: [PATCH v2 0/4] Firmware LSM hook
From: Roberto Sassu @ 2026-04-09 12:27 UTC (permalink / raw)
  To: Leon Romanovsky, KP Singh, Matt Bobrowski, Alexei Starovoitov,
	Daniel Borkmann, John Fastabend, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
	Jason Gunthorpe, Saeed Mahameed, Itay Avraham, Dave Jiang,
	Jonathan Cameron
  Cc: bpf, linux-kernel, linux-kselftest, linux-rdma, Chiara Meiohas,
	Maher Sanalla, paul, linux-security-module
In-Reply-To: <20260409121230.GA720371@unreal>

On Thu, 2026-04-09 at 15:12 +0300, Leon Romanovsky wrote:
> On Tue, Mar 31, 2026 at 08:56:32AM +0300, Leon Romanovsky wrote:
> > From Chiara:
> > 
> > This patch set introduces a new BPF LSM hook to validate firmware commands
> > triggered by userspace before they are submitted to the device. The hook
> > runs after the command buffer is constructed, right before it is sent
> > to firmware.
> 
> <...>
> 
> > ---
> > Chiara Meiohas (4):
> >       bpf: add firmware command validation hook
> >       selftests/bpf: add test cases for fw_validate_cmd hook
> >       RDMA/mlx5: Externally validate FW commands supplied in DEVX interface
> >       fwctl/mlx5: Externally validate FW commands supplied in fwctl
> 
> Hi,
> 
> Can we get Ack from BPF/LSM side?

+ Paul, linux-security-module ML

Hi

probably you also want to get an Ack from the LSM maintainer (added in
CC with the list). Most likely, he will also ask you to create the
security_*() functions counterparts of the BPF hooks.

Roberto

> Thanks
> 
> > 
> >  drivers/fwctl/mlx5/main.c                        | 12 +++++-
> >  drivers/infiniband/hw/mlx5/devx.c                | 49 ++++++++++++++++++------
> >  include/linux/bpf_lsm.h                          | 41 ++++++++++++++++++++
> >  kernel/bpf/bpf_lsm.c                             | 11 ++++++
> >  tools/testing/selftests/bpf/progs/verifier_lsm.c | 23 +++++++++++
> >  5 files changed, 122 insertions(+), 14 deletions(-)
> > ---
> > base-commit: 11439c4635edd669ae435eec308f4ab8a0804808
> > change-id: 20260309-fw-lsm-hook-7c094f909ffc
> > 
> > Best regards,
> > --  
> > Leon Romanovsky <leonro@nvidia.com>
> > 


* Re: [PATCH v4 3/3] selinux: fix overlayfs mmap() and mprotect() access checks
From: Ondrej Mosnacek @ 2026-04-09  9:16 UTC (permalink / raw)
  To: Paul Moore
  Cc: Stephen Smalley, linux-security-module, selinux, linux-fsdevel,
	linux-unionfs, linux-erofs, Amir Goldstein, Gao Xiang,
	Christian Brauner
In-Reply-To: <CAHC9VhQ5PH99EQBuYq4c7Jf82UXiDfC7qzM2kvnZuyH6yFPL_Q@mail.gmail.com>

On Tue, Apr 7, 2026 at 10:21 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Tue, Apr 7, 2026 at 3:20 PM Stephen Smalley
> <stephen.smalley.work@gmail.com> wrote:
> > On Tue, Apr 7, 2026 at 10:35 AM Paul Moore <paul@paul-moore.com> wrote:
> > > On Tue, Apr 7, 2026 at 8:14 AM Stephen Smalley
> > > <stephen.smalley.work@gmail.com> wrote:
> > > > On Thu, Apr 2, 2026 at 11:09 PM Paul Moore <paul@paul-moore.com> wrote:
> > > > >
> > > > > The existing SELinux security model for overlayfs is to allow access if
> > > > > the current task is able to access the top level file (the "user" file)
> > > > > and the mounter's credentials are sufficient to access the lower
> > > > > level file (the "backing" file).  Unfortunately, the current code does
> > > > > not properly enforce these access controls for both mmap() and mprotect()
> > > > > operations on overlayfs filesystems.
> > > > >
> > > > > This patch makes use of the newly created security_mmap_backing_file()
> > > > > LSM hook to provide the missing backing file enforcement for mmap()
> > > > > operations, and leverages the backing file API and new LSM blob to
> > > > > provide the necessary information to properly enforce the mprotect()
> > > > > access controls.
> > > > >
> > > > > Cc: stable@vger.kernel.org
> > > > > Signed-off-by: Paul Moore <paul@paul-moore.com>
> > > >
> > > > Do you have tests for these changes showing the before and after (i.e.
> > > > failing without your patches, passing with them)? I tried running an
> > > > earlier set from Ondrej but they failed.
> > >
> > > A few months ago I sent you and Ondrej some feedback on those early
> > > tests from Ondrej, but yes, I also had problems with Ondrej's tests.
> > > I've been using a hacked up combination of the existing tests, some of
> > > Ondrej's additions, and an additional debug/test patch to ensure the
> > > labeling is correct.  It's far from ideal, but I didn't invest time in
> > > test development as I assumed Ondrej would continue his efforts there
> > > (unfortunately it doesn't appear that he has?), and I wanted to focus
> > > on getting a solution as soon as possible for obvious reasons.
> >
> > Ok, I'm happy to look at even unpolished tests - just want something I
> > can use to exercise the before and after states.
>
> Hopefully Ondrej can provide an updated patch.

Sorry for the radio silence... I just posted the fixed patch to the list.

I also pushed a more targeted standalone TMT/beakerlib test here,
which also tests the dynamic transition situation:
https://src.fedoraproject.org/fork/omos/tests/selinux/blob/overlayfs-mmap-bugs/f/kernel/overlayfs-mmap-bugs

To run it on Fedora, it should be enough to `dnf install -y beakerlib
selinux-policy-devel gcc` and run the runtest.sh script directly.

-- 
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.


* Re: [PATCH v2] KEYS: trusted: Debugging as a feature
From: Nayna Jain @ 2026-04-09  0:41 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-integrity, keyrings, Srish Srinivasan, James Bottomley,
	Mimi Zohar, David Howells, Paul Moore, James Morris,
	Serge E. Hallyn, Ahmad Fatoum, Pengutronix Kernel Team, open list,
	open list:SECURITY SUBSYSTEM
In-Reply-To: <adYRURAJfNCu0FYB@kernel.org>


On 4/8/26 4:26 AM, Jarkko Sakkinen wrote:
> On Mon, Apr 06, 2026 at 10:42:00PM -0400, Nayna Jain wrote:
>> On 3/24/26 7:00 AM, Jarkko Sakkinen wrote:
>>> TPM_DEBUG, and other similar flags, are a non-standard way to specify a
>>> feature in Linux kernel.  Introduce CONFIG_TRUSTED_KEYS_DEBUG for
>>> trusted keys, and use it to replace these ad-hoc feature flags.
>>>
>>> Given that trusted keys debug dumps can contain sensitive data, harden
>>> the feature as follows:
>>>
>>> 1. In the Kconfig description postulate that pr_debug() statements must be
>>>      used.
>>> 2. Use pr_debug() statements in TPM 1.x driver to print the protocol dump.
>>>
>>> Traces, when actually needed, can be easily enabled by providing
>>> trusted.dyndbg='+p' in the kernel command-line.
>>>
>>> Cc: Srish Srinivasan <ssrish@linux.ibm.com>
>>> Reported-by: Nayna Jain <nayna@linux.ibm.com>
>>> Closes: https://lore.kernel.org/all/7f8b8478-5cd8-4d97-bfd0-341fd5cf10f9@linux.ibm.com/
>>> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
>>> ---
>>> v2:
>>> - Implement for all trusted keys backends.
>>> - Add HAVE_TRUSTED_KEYS_DEBUG as it is a good practice despite full
>>>     coverage.
>>> ---
>>>    include/keys/trusted-type.h               | 18 +++++-------
>>>    security/keys/trusted-keys/Kconfig        | 19 ++++++++++++
>>>    security/keys/trusted-keys/trusted_caam.c |  4 +--
>>>    security/keys/trusted-keys/trusted_tpm1.c | 36 +++++++++++------------
>>>    4 files changed, 46 insertions(+), 31 deletions(-)
>>>
>>> diff --git a/include/keys/trusted-type.h b/include/keys/trusted-type.h
>>> index 03527162613f..620a1f890b6b 100644
>>> --- a/include/keys/trusted-type.h
>>> +++ b/include/keys/trusted-type.h
>>> @@ -83,18 +83,16 @@ struct trusted_key_source {
>>>    extern struct key_type key_type_trusted;
>>> -#define TRUSTED_DEBUG 0
>>> -
>>> -#if TRUSTED_DEBUG
>>> +#ifdef CONFIG_TRUSTED_KEYS_DEBUG
>>>    static inline void dump_payload(struct trusted_key_payload *p)
>>>    {
>>> -	pr_info("key_len %d\n", p->key_len);
>>> -	print_hex_dump(KERN_INFO, "key ", DUMP_PREFIX_NONE,
>>> -		       16, 1, p->key, p->key_len, 0);
>>> -	pr_info("bloblen %d\n", p->blob_len);
>>> -	print_hex_dump(KERN_INFO, "blob ", DUMP_PREFIX_NONE,
>>> -		       16, 1, p->blob, p->blob_len, 0);
>>> -	pr_info("migratable %d\n", p->migratable);
>>> +	pr_debug("key_len %d\n", p->key_len);
>>> +	print_hex_dump_debug("key ", DUMP_PREFIX_NONE,
>>> +			     16, 1, p->key, p->key_len, 0);
>>> +	pr_debug("bloblen %d\n", p->blob_len);
>>> +	print_hex_dump_debug("blob ", DUMP_PREFIX_NONE,
>>> +			     16, 1, p->blob, p->blob_len, 0);
>>> +	pr_debug("migratable %d\n", p->migratable);
>>>    }
>>>    #else
>>>    static inline void dump_payload(struct trusted_key_payload *p)
>>> diff --git a/security/keys/trusted-keys/Kconfig b/security/keys/trusted-keys/Kconfig
>>> index 9e00482d886a..2ad9ba0e03f1 100644
>>> --- a/security/keys/trusted-keys/Kconfig
>>> +++ b/security/keys/trusted-keys/Kconfig
>>> @@ -1,10 +1,25 @@
>>>    config HAVE_TRUSTED_KEYS
>>>    	bool
>>> +config HAVE_TRUSTED_KEYS_DEBUG
>>> +	bool
>>> +
>>> +config TRUSTED_KEYS_DEBUG
>>> +	bool "Debug trusted keys"
>>> +	depends on HAVE_TRUSTED_KEYS_DEBUG
>>> +	default n
>>> +	help
>>> +	  Trusted keys backends and core code that support debug dumps
>>> +	  can opt-in that feature here. Dumps must only use DEBUG
>>> +	  level output, as sensitive data may pass by. In the
>>> +	  kernel-command line traces can be enabled via
>>> +	  trusted.dyndbg='+p'.
>> Would it be good idea to add an explicit note/warning:
>>
>>
>> NOTE: This option is intended for debugging purposes only. Do not enable on
>> production systems as debug output may expose sensitive cryptographic
>> material.
>> If you are unsure, say N.
>>
>> Apart from this, looks good to me.
>>
>> Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
> Thanks, I'll add your tag but would you mind quickly screening v3 again
> where I add "trusted.debug=0|1". And yes, your suggestion about extra
> warning makes sense.
Sure, Jarkko. However, I don't see the v3 version in my inbox or on
linux-integrity. Or are you about to post it soon?
>
> Let's make this as safe as possible. Mistakes do happen... and then those
> measures pay off :-)
Yes, agreed.
>
> BR, Jarkko

Thanks & Regards,

     - Nayna
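
The two layers of protection discussed above (a build-time option plus a
runtime switch) can be sketched in userspace as follows.  This is only an
illustration under stated assumptions: DEMO_TRUSTED_KEYS_DEBUG stands in
for CONFIG_TRUSTED_KEYS_DEBUG, demo_debug_enabled stands in for the
dyndbg / "trusted.debug" switch, and fprintf() replaces pr_debug().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Runtime switch; off by default so nothing leaks unless explicitly enabled. */
bool demo_debug_enabled;

#ifdef DEMO_TRUSTED_KEYS_DEBUG
/* Debug build: dump the key bytes, but only when the runtime switch is on. */
size_t demo_dump_payload(FILE *out, const unsigned char *key, size_t len)
{
	size_t i;

	assert(out != NULL && key != NULL);
	if (!demo_debug_enabled)
		return 0;

	fprintf(out, "key_len %zu\nkey ", len);
	for (i = 0; i < len; i++)
		fprintf(out, "%02x ", key[i]);
	fputc('\n', out);
	return len;
}
#else
/* Default build: the dump compiles to a stub, so key bytes are never printed. */
size_t demo_dump_payload(FILE *out, const unsigned char *key, size_t len)
{
	(void)out;
	(void)key;
	(void)len;
	return 0;
}
#endif
```

The function returns the number of key bytes written, so a caller or a
test can confirm that a default build with the switch off emits nothing.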


* Re: [RFC PATCH 00/20] BPF interface for applying Landlock rulesets
From: Mickaël Salaün @ 2026-04-08 19:21 UTC (permalink / raw)
  To: Justin Suess
  Cc: andrii, ast, bpf, brauner, daniel, eddyz87, fred, gnoack, jack,
	jmorris, john.fastabend, kees, kpsingh, linux-fsdevel,
	linux-kernel, linux-security-module, m, martin.lau, paul
In-Reply-To: <20260408171030.4083129-1-utilityemal77@gmail.com>

On Wed, Apr 08, 2026 at 01:10:28PM -0400, Justin Suess wrote:
> 
> Add a flag LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS, which executes
> task_set_no_new_privs on the current credentials, but only if
> the process lacks the CAP_SYS_ADMIN capability.
> 
> While this operation is redundant for code running from userspace
> (indeed callers may achieve the same logic by calling
> prctl w/ PR_SET_NO_NEW_PRIVS), this flag enables callers without access
> to the syscall abi (defined in subsequent patches) to restrict processes
> from gaining additional capabilities. This is important to ensure that
> consumers can meet the task_no_new_privs || CAP_SYS_ADMIN invariant
> enforced by Landlock without having syscall access.
> 
> This is done by hooking bprm_committing_creds along with a
> landlock_cred_security flag to indicate that the next execution should
> task_set_no_new_privs if the process doesn't possess CAP_SYS_ADMIN. This
> is done to ensure that task_set_no_new_privs is being done past the
> point of no return.
> 
> Cc: Mickaël Salaün <mic@digikod.net>
> Signed-off-by: Justin Suess <utilityemal77@gmail.com>
> ---
> 
> On Wed, Apr 08, 2026 at 02:00:00 -0000, Mickaël Salaün wrote:
> > > Points of Feedback
> > > ===
> > > 
> > > First, the new set_nnp_on_point_of_no_return field in struct linux_binprm.
> > > This field was needed to request that task_set_no_new_privs be set during an
> > > execution, but only after the execution has proceeded beyond the point of no
> > > return. I couldn't find a way to express this semantic without adding a new
> > > bitfield to struct linux_binprm and a conditional in fs/exec.c. Please see
> > > patch 2.
> 
> > What about using security_bprm_committing_creds()?
> 
> Good idea. Definitely cleaner.
> 
> Something like this? Then dropping the "execve: Add set_nnp_on_point_of_no_return"
> commit.
> 
> This adds a bitfield to the landlock_cred_security struct to indicate that the flag
> should be set on the next exec(s).
> 
>  include/uapi/linux/landlock.h | 14 ++++++++++++++
>  security/landlock/cred.c      | 13 +++++++++++++
>  security/landlock/cred.h      |  7 +++++++
>  security/landlock/limits.h    |  2 +-
>  security/landlock/ruleset.c   | 15 ++++++++++++---
>  security/landlock/syscalls.c  |  5 +++++
>  6 files changed, 52 insertions(+), 4 deletions(-)
> 
> diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
> index f88fa1f68b77..edd9d9a7f60e 100644
> --- a/include/uapi/linux/landlock.h
> +++ b/include/uapi/linux/landlock.h
> @@ -129,12 +129,26 @@ struct landlock_ruleset_attr {
>   *
>   *     If the calling thread is running with no_new_privs, this operation
>   *     enables no_new_privs on the sibling threads as well.
> + *
> + * %LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
> + *    Sets no_new_privs on the calling thread before applying the Landlock domain.
> + *    This flag is useful for convenience as well as for applying a ruleset from
> + *    an outside context (e.g. BPF). This flag only has an effect when both
> + *    no_new_privs isn't already set and the caller doesn't possess CAP_SYS_ADMIN.
> + *
> + *    This flag has slightly different behavior when used from BPF. Instead of
> + *    setting no_new_privs on the current task, it sets a flag on the bprm so that
> + *    no_new_privs is set on the task at exec point-of-no-return. This guarantees
> + *    that the current execution is unaffected, and may escalate as usual until the
> + *    next exec, but the resulting task cannot gain more privileges through later
> + *    exec transitions.
>   */
>  /* clang-format off */
>  #define LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF		(1U << 0)
>  #define LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON			(1U << 1)
>  #define LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF		(1U << 2)
>  #define LANDLOCK_RESTRICT_SELF_TSYNC				(1U << 3)
> +#define LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS			(1U << 4)
>  /* clang-format on */
>  
>  /**
> diff --git a/security/landlock/cred.c b/security/landlock/cred.c
> index 0cb3edde4d18..bcc9b716916f 100644
> --- a/security/landlock/cred.c
> +++ b/security/landlock/cred.c
> @@ -43,6 +43,18 @@ static void hook_cred_free(struct cred *const cred)
>  		landlock_put_ruleset_deferred(dom);
>  }
>  
> +static void hook_bprm_committing_creds(const struct linux_binprm *bprm)
> +{
> +	struct landlock_cred_security *const llcred = landlock_cred(bprm->cred);
> +
> +	if (llcred->set_nnp_on_committing_creds &&
> +	    !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN)) {

If asked by the caller, NNP must be set, whatever the capabilities of
the task.
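
To make the difference concrete, here is a small userspace model (not
kernel code) that contrasts the hook as posted with the unconditional
variant asked for above. The struct and function names are made up for
illustration; the only intended difference is the capability check.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the task state and the Landlock cred blob. */
struct task_model {
	bool no_new_privs;
	bool cap_sys_admin;
};

struct llcred_model {
	bool set_nnp_on_committing_creds;
};

/* Hook as posted: the request is ignored for CAP_SYS_ADMIN tasks. */
static void commit_as_posted(struct task_model *task,
			     struct llcred_model *llcred)
{
	if (llcred->set_nnp_on_committing_creds && !task->cap_sys_admin) {
		task->no_new_privs = true;
		llcred->set_nnp_on_committing_creds = false;
	}
}

/* Requested behavior: honor the flag whatever the capabilities are. */
static void commit_unconditional(struct task_model *task,
				 struct llcred_model *llcred)
{
	if (llcred->set_nnp_on_committing_creds) {
		task->no_new_privs = true;
		llcred->set_nnp_on_committing_creds = false;
	}
}
```

With a privileged task and the flag set, the first variant leaves
no_new_privs clear while the second one sets it.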

> +		task_set_no_new_privs(current);
> +		/* Don't need to set it again for subsequent execution. */
> +		llcred->set_nnp_on_committing_creds = false;
> +	}

Thinking more about it, it would make more sense to add another flag to
enforce restriction on the next exec.  This new cred bit would then be
generic and enforce both NNP (if set) and the domain once we know the
execution is ok.  That should also bring the required plumbing to
create the domain at syscall (or kfunc) time and handle memory
allocation issue there, but only enforce it at exec time with
security_bprm_committing_creds() (without any possible error).
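
A rough userspace sketch of that two-phase idea follows. It is only a
model under stated assumptions: the types and helpers are hypothetical,
and the model replaces the domain instead of stacking it on a parent as
real Landlock domains do. The point is that everything that can fail
lives in the prepare step, and the commit step returns void.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Landlock domain. */
struct domain_model {
	int id;
};

/* Hypothetical stand-in for the per-cred Landlock state. */
struct cred_model {
	struct domain_model *domain;	/* currently enforced */
	struct domain_model *pending;	/* prepared, enforced at next exec */
	bool pending_nnp;
	bool no_new_privs;
};

/* Phase 1 (syscall or kfunc time): may fail, enforces nothing yet. */
static int prepare_restriction(struct cred_model *cred, int id, bool want_nnp)
{
	struct domain_model *dom = malloc(sizeof(*dom));

	if (!dom)
		return -ENOMEM;

	dom->id = id;
	free(cred->pending);	/* drop an earlier, never committed request */
	cred->pending = dom;
	cred->pending_nnp = want_nnp;
	return 0;
}

/* Phase 2 (committing creds): no allocation, so no error path. */
static void commit_restriction(struct cred_model *cred)
{
	if (!cred->pending)
		return;

	free(cred->domain);	/* model only: real domains are layered */
	cred->domain = cred->pending;
	cred->pending = NULL;
	if (cred->pending_nnp)
		cred->no_new_privs = true;
	cred->pending_nnp = false;
}
```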

> +}
> +
>  #ifdef CONFIG_AUDIT
>  
>  static int hook_bprm_creds_for_exec(struct linux_binprm *const bprm)
> @@ -55,6 +67,7 @@ static int hook_bprm_creds_for_exec(struct linux_binprm *const bprm)
>  #endif /* CONFIG_AUDIT */
>  
>  static struct security_hook_list landlock_hooks[] __ro_after_init = {
> +	LSM_HOOK_INIT(bprm_committing_creds, hook_bprm_committing_creds),
>  	LSM_HOOK_INIT(cred_prepare, hook_cred_prepare),
>  	LSM_HOOK_INIT(cred_transfer, hook_cred_transfer),
>  	LSM_HOOK_INIT(cred_free, hook_cred_free),
> diff --git a/security/landlock/cred.h b/security/landlock/cred.h
> index c10a06727eb1..7ec6dd12ebc3 100644
> --- a/security/landlock/cred.h
> +++ b/security/landlock/cred.h
> @@ -49,6 +49,13 @@ struct landlock_cred_security {
>  	 * not require a current domain.
>  	 */
>  	u8 log_subdomains_off : 1;
> +	/**
> +	 * @set_nnp_on_committing_creds: Set if the domain should set NO_NEW_PRIVS on the
> +	 * execution past the point of no return in security_bprm_committing_creds().
> +	 * This is not a hierarchy configuration because the nnp state is inherited by
> +	 * exec and doesn't need further configuration.
> +	 */
> +	u8 set_nnp_on_committing_creds : 1;
>  #endif /* CONFIG_AUDIT */
>  } __packed;
>  
> diff --git a/security/landlock/limits.h b/security/landlock/limits.h
> index eb584f47288d..d298086a4180 100644
> --- a/security/landlock/limits.h
> +++ b/security/landlock/limits.h
> @@ -31,7 +31,7 @@
>  #define LANDLOCK_MASK_SCOPE		((LANDLOCK_LAST_SCOPE << 1) - 1)
>  #define LANDLOCK_NUM_SCOPE		__const_hweight64(LANDLOCK_MASK_SCOPE)
>  
> -#define LANDLOCK_LAST_RESTRICT_SELF	LANDLOCK_RESTRICT_SELF_TSYNC
> +#define LANDLOCK_LAST_RESTRICT_SELF	LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
>  #define LANDLOCK_MASK_RESTRICT_SELF	((LANDLOCK_LAST_RESTRICT_SELF << 1) - 1)
>  
>  /* clang-format on */
> diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
> index 1d6fa74f2a52..ad0bd5994ec5 100644
> --- a/security/landlock/ruleset.c
> +++ b/security/landlock/ruleset.c
> @@ -121,11 +121,13 @@ int landlock_restrict_cred_precheck(const __u32 flags,
>  
>  	/*
>  	 * Similar checks as for seccomp(2), except that an -EPERM may be
> -	 * returned.
> +	 * returned, or no_new_privs may be set by the caller via
> +	 * LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS.
>  	 */
>  	if (!task_no_new_privs(current) &&
>  	    !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN)) {
> -		return -EPERM;
> +		if (!(flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS))
> +			return -EPERM;
>  	}
>  
>  	if (flags & ~LANDLOCK_MASK_RESTRICT_SELF)
> @@ -140,7 +142,7 @@ int landlock_restrict_cred(struct cred *const cred,
>  {
>  	struct landlock_cred_security *new_llcred;
>  	bool __maybe_unused log_same_exec, log_new_exec, log_subdomains,
> -		prev_log_subdomains;
> +		prev_log_subdomains, set_nnp_on_committing_creds;
>  
>  	/*
>  	 * It is allowed to set LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF without
> @@ -157,6 +159,12 @@ int landlock_restrict_cred(struct cred *const cred,
>  	log_new_exec = !!(flags & LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON);
>  	/* Translates "off" flag to boolean. */
>  	log_subdomains = !(flags & LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF);
> +	/*
> +	 * Translates "on" flag to boolean. This flag is not inherited by exec,
> +	 * but the resulting nnp state is.
> +	 */
> +	set_nnp_on_committing_creds =
> +		!!(flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS);
>  
>  	new_llcred = landlock_cred(cred);
>  
> @@ -165,6 +173,7 @@ int landlock_restrict_cred(struct cred *const cred,
>  	new_llcred->log_subdomains_off = !prev_log_subdomains ||
>  					 !log_subdomains;
>  #endif /* CONFIG_AUDIT */
> +	new_llcred->set_nnp_on_committing_creds = set_nnp_on_committing_creds;
>  
>  	/*
>  	 * The only case when a ruleset may not be set is if
> diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
> index c6c7be7698a2..f3520c764360 100644
> --- a/security/landlock/syscalls.c
> +++ b/security/landlock/syscalls.c
> @@ -397,6 +397,7 @@ SYSCALL_DEFINE4(landlock_add_rule, const int, ruleset_fd,
>   *         - %LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON
>   *         - %LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF
>   *         - %LANDLOCK_RESTRICT_SELF_TSYNC
> + *         - %LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
>   *
>   * This system call enforces a Landlock ruleset on the current thread.
>   * Enforcing a ruleset requires that the task has %CAP_SYS_ADMIN in its
> @@ -450,6 +451,10 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
>  	if (!new_cred)
>  		return -ENOMEM;
>  
> +	if (flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS &&
> +	    !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN))
> +		task_set_no_new_privs(current);
> +
>  	err = landlock_restrict_cred(new_cred, ruleset, flags);
>  	if (err) {
>  		abort_creds(new_cred);
> -- 
> 2.53.0
> 
> 

^ permalink raw reply

* [PATCH v2 0/2] Add support for ML-DSA signature for EVM and IMA
From: Stefan Berger @ 2026-04-08 17:41 UTC (permalink / raw)
  To: linux-integrity, linux-security-module
  Cc: linux-kernel, zohar, roberto.sassu, ebiggers, Stefan Berger

Based on IMA sigv3 type of signatures, add support for ML-DSA signature
for EVM and IMA. Use the existing ML-DSA hashless signing mode (pure mode).

   Stefan

v2:
  - Dropped 1/3
  - Using "none" as hash_algo in 2/2

Stefan Berger (2):
  integrity: Refactor asymmetric_verify for reusability
  integrity: Add support for sigv3 verification using ML-DSA keys

 security/integrity/digsig_asymmetric.c | 126 +++++++++++++++++++++----
 1 file changed, 107 insertions(+), 19 deletions(-)


base-commit: 82bbd447199ff1441031d2eaf9afe041550cf525
-- 
2.53.0


^ permalink raw reply

* [PATCH v2 2/2] integrity: Add support for sigv3 verification using ML-DSA keys
From: Stefan Berger @ 2026-04-08 17:41 UTC (permalink / raw)
  To: linux-integrity, linux-security-module
  Cc: linux-kernel, zohar, roberto.sassu, ebiggers, Stefan Berger
In-Reply-To: <20260408174154.139606-1-stefanb@linux.ibm.com>

Add support for sigv3 signature verification using ML-DSA in pure mode.
When a sigv3 signature is verified, first check whether the key to use
for verification is an ML-DSA key and therefore uses a hashless signature
verification scheme. The hashless signature verification method uses the
ima_file_id structure directly for signature verification rather than
its digest.
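
As an illustration of what gets signed in the hashless case, the
following userspace sketch computes the length of the trimmed
ima_file_id message. It assumes the structure is packed, with two
one-byte fields followed by a HASH_MAX_DIGESTSIZE (64) byte hash array;
the _model names are invented for this example and are not kernel code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_MAX_DIGESTSIZE 64	/* assumed to match the kernel's value */

/* Assumed mirror of the kernel's struct ima_file_id layout. */
struct ima_file_id_model {
	uint8_t hash_type;	/* xattr type */
	uint8_t hash_algorithm;	/* digest algorithm identifier */
	uint8_t hash[HASH_MAX_DIGESTSIZE];
} __attribute__((packed));

/* Number of bytes verified: the two header bytes plus the used digest. */
static size_t file_id_message_size(size_t digest_size)
{
	return sizeof(struct ima_file_id_model) -
	       (HASH_MAX_DIGESTSIZE - digest_size);
}

/* Fill the structure and return the message length, or 0 if too large. */
static size_t build_file_id(struct ima_file_id_model *id, uint8_t type,
			    uint8_t algo, const uint8_t *digest,
			    size_t digest_size)
{
	if (digest_size > HASH_MAX_DIGESTSIZE)
		return 0;

	memset(id, 0, sizeof(*id));
	id->hash_type = type;
	id->hash_algorithm = algo;
	memcpy(id->hash, digest, digest_size);
	return file_id_message_size(digest_size);
}
```

For a 32-byte digest the message is 34 bytes long; the unused tail of
the hash array is never passed to the verifier.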

Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>

---
v2: Set hash_algo in public_key_signature to "none"
---
 security/integrity/digsig_asymmetric.c | 84 ++++++++++++++++++++++++--
 1 file changed, 79 insertions(+), 5 deletions(-)

diff --git a/security/integrity/digsig_asymmetric.c b/security/integrity/digsig_asymmetric.c
index e29ed73f15cd..c80cb2b117a6 100644
--- a/security/integrity/digsig_asymmetric.c
+++ b/security/integrity/digsig_asymmetric.c
@@ -190,17 +190,91 @@ static int calc_file_id_hash(enum evm_ima_xattr_type type,
 	return rc;
 }
 
+/*
+ * asymmetric_verify_v3_hashless - Use hashless signature verification on sigv3
+ * @key: The key to use for signature verification
+ * @pk: The associated public key
+ * @encoding: The encoding the key type uses
+ * @sig: The signature
+ * @siglen: The length of the xattr signature
+ * @algo: The hash algorithm
+ * @digest: The file digest
+ *
+ * Create an ima_file_id structure and use it for signature verification
+ * directly. This can be used for ML-DSA in pure mode for example.
+ */
+static int asymmetric_verify_v3_hashless(struct key *key,
+					 const struct public_key *pk,
+					 const char *encoding,
+					 const char *sig, int siglen,
+					 u8 algo,
+					 const u8 *digest)
+{
+	struct signature_v2_hdr *hdr = (struct signature_v2_hdr *)sig;
+	struct ima_file_id file_id = {
+		.hash_type = hdr->type,
+		.hash_algorithm = algo,
+	};
+	size_t digest_size = hash_digest_size[algo];
+	struct public_key_signature pks = {
+		.m = (u8 *)&file_id,
+		.m_size = sizeof(file_id) - (HASH_MAX_DIGESTSIZE - digest_size),
+		.s = hdr->sig,
+		.s_size = siglen - sizeof(*hdr),
+		.pkey_algo = pk->pkey_algo,
+		.hash_algo = "none",
+		.encoding = encoding,
+	};
+	int ret;
+
+	if (hdr->type != IMA_VERITY_DIGSIG &&
+	    hdr->type != EVM_IMA_XATTR_DIGSIG &&
+	    hdr->type != EVM_XATTR_PORTABLE_DIGSIG)
+		return -EINVAL;
+
+	if (pks.s_size != be16_to_cpu(hdr->sig_size))
+		return -EBADMSG;
+
+	memcpy(file_id.hash, digest, digest_size);
+
+	ret = verify_signature(key, &pks);
+	pr_debug("%s() = %d\n", __func__, ret);
+	return ret;
+}
+
 int asymmetric_verify_v3(struct key *keyring, const char *sig, int siglen,
 			 const char *data, int datalen, u8 algo)
 {
 	struct signature_v2_hdr *hdr = (struct signature_v2_hdr *)sig;
 	struct ima_max_digest_data hash;
+	const struct public_key *pk;
+	struct key *key;
 	int rc;
 
-	rc = calc_file_id_hash(hdr->type, algo, data, &hash);
-	if (rc)
-		return -EINVAL;
+	if (siglen <= sizeof(*hdr))
+		return -EBADMSG;
+
+	key = request_asymmetric_key(keyring, be32_to_cpu(hdr->keyid));
+	if (IS_ERR(key))
+		return PTR_ERR(key);
 
-	return asymmetric_verify(keyring, sig, siglen, hash.digest,
-				 hash.hdr.length);
+	pk = asymmetric_key_public_key(key);
+	if (!strncmp(pk->pkey_algo, "mldsa", 5)) {
+		rc = asymmetric_verify_v3_hashless(key, pk, "raw",
+						   sig, siglen, algo, data);
+	} else {
+		rc = calc_file_id_hash(hdr->type, algo, data, &hash);
+		if (rc) {
+			rc = -EINVAL;
+			goto err_exit;
+		}
+
+		rc = asymmetric_verify_common(key, pk, sig, siglen, hash.digest,
+					      hash.hdr.length);
+	}
+
+err_exit:
+	key_put(key);
+
+	return rc;
 }
-- 
2.53.0


^ permalink raw reply related

* [PATCH v2 1/2] integrity: Refactor asymmetric_verify for reusability
From: Stefan Berger @ 2026-04-08 17:41 UTC (permalink / raw)
  To: linux-integrity, linux-security-module
  Cc: linux-kernel, zohar, roberto.sassu, ebiggers, Stefan Berger
In-Reply-To: <20260408174154.139606-1-stefanb@linux.ibm.com>

Refactor asymmetric_verify for reusability. Have it call
asymmetric_verify_common with the signature verification key and the
public_key structure as parameters. sigv3 support for ML-DSA will need to
check the public key type first to decide how to do the signature
verification and therefore will have these parameters available for
calling asymmetric_verify_common.
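
The ownership split can be pictured with a small userspace model. The
names below are hypothetical and the "key" is just a reference counter;
the sketch only shows that the common helper borrows the key while the
wrapper takes and drops the reference on every path.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted key. */
struct key_model {
	int refcount;
	int valid;
};

static struct key_model *key_get_model(struct key_model *k)
{
	if (!k || !k->valid)
		return NULL;
	k->refcount++;
	return k;
}

static void key_put_model(struct key_model *k)
{
	k->refcount--;
}

/* Common helper: borrows the key and never drops the reference. */
static int verify_common_model(const struct key_model *key,
			       const char *sig, int siglen)
{
	(void)key;
	if (!sig || siglen <= 0)
		return -EBADMSG;
	return 0;
}

/* Wrapper: owns lookup and release, including on the error path. */
static int verify_model(struct key_model *entry, const char *sig, int siglen)
{
	struct key_model *key = key_get_model(entry);
	int ret;

	if (!key)
		return -ENOENT;

	ret = verify_common_model(key, sig, siglen);
	key_put_model(key);
	return ret;
}
```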

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 security/integrity/digsig_asymmetric.c | 42 +++++++++++++++++---------
 1 file changed, 28 insertions(+), 14 deletions(-)

diff --git a/security/integrity/digsig_asymmetric.c b/security/integrity/digsig_asymmetric.c
index 6e68ec3becbd..e29ed73f15cd 100644
--- a/security/integrity/digsig_asymmetric.c
+++ b/security/integrity/digsig_asymmetric.c
@@ -79,18 +79,15 @@ static struct key *request_asymmetric_key(struct key *keyring, uint32_t keyid)
 	return key;
 }
 
-int asymmetric_verify(struct key *keyring, const char *sig,
-		      int siglen, const char *data, int datalen)
+static int asymmetric_verify_common(const struct key *key,
+				    const struct public_key *pk,
+				    const char *sig, int siglen,
+				    const char *data, int datalen)
 {
-	struct public_key_signature pks;
 	struct signature_v2_hdr *hdr = (struct signature_v2_hdr *)sig;
-	const struct public_key *pk;
-	struct key *key;
+	struct public_key_signature pks;
 	int ret;
 
-	if (siglen <= sizeof(*hdr))
-		return -EBADMSG;
-
 	siglen -= sizeof(*hdr);
 
 	if (siglen != be16_to_cpu(hdr->sig_size))
@@ -99,15 +96,10 @@ int asymmetric_verify(struct key *keyring, const char *sig,
 	if (hdr->hash_algo >= HASH_ALGO__LAST)
 		return -ENOPKG;
 
-	key = request_asymmetric_key(keyring, be32_to_cpu(hdr->keyid));
-	if (IS_ERR(key))
-		return PTR_ERR(key);
-
 	memset(&pks, 0, sizeof(pks));
 
 	pks.hash_algo = hash_algo_name[hdr->hash_algo];
 
-	pk = asymmetric_key_public_key(key);
 	pks.pkey_algo = pk->pkey_algo;
 	if (!strcmp(pk->pkey_algo, "rsa")) {
 		pks.encoding = "pkcs1";
@@ -127,11 +119,33 @@ int asymmetric_verify(struct key *keyring, const char *sig,
 	pks.s_size = siglen;
 	ret = verify_signature(key, &pks);
 out:
-	key_put(key);
 	pr_debug("%s() = %d\n", __func__, ret);
 	return ret;
 }
 
+int asymmetric_verify(struct key *keyring, const char *sig,
+		      int siglen, const char *data, int datalen)
+{
+	struct signature_v2_hdr *hdr = (struct signature_v2_hdr *)sig;
+	const struct public_key *pk;
+	struct key *key;
+	int ret;
+
+	if (siglen <= sizeof(*hdr))
+		return -EBADMSG;
+
+	key = request_asymmetric_key(keyring, be32_to_cpu(hdr->keyid));
+	if (IS_ERR(key))
+		return PTR_ERR(key);
+	pk = asymmetric_key_public_key(key);
+
+	ret = asymmetric_verify_common(key, pk, sig, siglen, data, datalen);
+
+	key_put(key);
+
+	return ret;
+}
+
 /*
  * calc_file_id_hash - calculate the hash of the ima_file_id struct data
  * @type: xattr type [enum evm_ima_xattr_type]
-- 
2.53.0


^ permalink raw reply related

* Re: [PATCH 1/3] crypto: public_key: Remove check for valid hash_algo for ML-DSA keys
From: Stefan Berger @ 2026-04-08 17:25 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-integrity, linux-security-module, linux-kernel, zohar,
	roberto.sassu, David Howells, Lukas Wunner, Ignat Korchagin,
	keyrings, linux-crypto
In-Reply-To: <20260406165350.GD2971@sol>



On 4/6/26 12:53 PM, Eric Biggers wrote:
> On Sun, Apr 05, 2026 at 07:12:22PM -0400, Stefan Berger wrote:
>> Remove the check for the hash_algo since ML-DSA is only used in pure mode
>> and there is no relevance of a hash_algo for the input data.
>>
>> Cc: David Howells <dhowells@redhat.com>
>> Cc: Lukas Wunner <lukas@wunner.de>
>> Cc: Ignat Korchagin <ignat@linux.win>
>> Cc: keyrings@vger.kernel.org
>> Cc: linux-crypto@vger.kernel.org
>> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
>> ---
>>   crypto/asymmetric_keys/public_key.c | 5 -----
>>   1 file changed, 5 deletions(-)
>>
>> diff --git a/crypto/asymmetric_keys/public_key.c b/crypto/asymmetric_keys/public_key.c
>> index 09a0b83d5d77..df6918a77ab8 100644
>> --- a/crypto/asymmetric_keys/public_key.c
>> +++ b/crypto/asymmetric_keys/public_key.c
>> @@ -147,11 +147,6 @@ software_key_determine_akcipher(const struct public_key *pkey,
>>   		   strcmp(pkey->pkey_algo, "mldsa87") == 0) {
>>   		if (strcmp(encoding, "raw") != 0)
>>   			return -EINVAL;
>> -		if (!hash_algo)
>> -			return -EINVAL;
>> -		if (strcmp(hash_algo, "none") != 0 &&
>> -		    strcmp(hash_algo, "sha512") != 0)
>> -			return -EINVAL;
> 
> Does this broaden which hash algorithms are accepted for CMS signatures
> that use ML-DSA and contain signed attributes?

Right... dropping this patch and using the "none" route now.
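
For reference, the check that stays in place can be restated as a
standalone userspace function. This is a sketch lifted from the quoted
hunk, with a made-up function name, not the kernel code itself:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Mirrors the quoted ML-DSA branch: the encoding must be "raw" and the
 * hash algorithm must be present and either "none" or "sha512".
 */
static int mldsa_params_ok(const char *encoding, const char *hash_algo)
{
	if (strcmp(encoding, "raw") != 0)
		return -EINVAL;
	if (!hash_algo)
		return -EINVAL;
	if (strcmp(hash_algo, "none") != 0 &&
	    strcmp(hash_algo, "sha512") != 0)
		return -EINVAL;
	return 0;
}
```

So a caller that sets hash_algo to "none" passes the existing check
unchanged, which is what v2 of the IMA patch relies on.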

> 
> - Eric
> 


^ permalink raw reply

* Re: [RFC PATCH 00/20] BPF interface for applying Landlock rulesets
From: Justin Suess @ 2026-04-08 17:10 UTC (permalink / raw)
  To: mic
  Cc: andrii, ast, bpf, brauner, daniel, eddyz87, fred, gnoack, jack,
	jmorris, john.fastabend, kees, kpsingh, linux-fsdevel,
	linux-kernel, linux-security-module, m, martin.lau, paul,
	Justin Suess
In-Reply-To: <20260408.ong9Eshe0omu@digikod.net>


Add a flag LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS, which executes
task_set_no_new_privs on the current credentials, but only if
the process lacks the CAP_SYS_ADMIN capability.

While this operation is redundant for code running from userspace
(indeed callers may achieve the same logic by calling
prctl w/ PR_SET_NO_NEW_PRIVS), this flag enables callers without access
to the syscall abi (defined in subsequent patches) to restrict processes
from gaining additional capabilities. This is important to ensure that
consumers can meet the task_no_new_privs || CAP_SYS_ADMIN invariant
enforced by Landlock without having syscall access.
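
For comparison, the userspace route mentioned above is a single prctl(2)
call. A minimal sketch, with helper names chosen only for this example:

```c
#include <sys/prctl.h>

/* Returns 0 on success, -1 on failure (errno is set by prctl). */
static int enable_no_new_privs(void)
{
	return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/* Returns 1 if no_new_privs is set, 0 if not, -1 on error. */
static int no_new_privs_is_set(void)
{
	return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

Once set, the bit cannot be cleared and is inherited across fork and
execve, which is why a caller without syscall access needs another way
to reach the same state.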

This is done by hooking bprm_committing_creds along with a
landlock_cred_security flag to indicate that the next execution should
task_set_no_new_privs if the process doesn't possess CAP_SYS_ADMIN. This
is done to ensure that task_set_no_new_privs is being done past the
point of no return.

Cc: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Justin Suess <utilityemal77@gmail.com>
---

On Wed, Apr 08, 2026 at 02:00:00 -0000, Mickaël Salaün wrote:
> > Points of Feedback
> > ===
> > 
> > First, the new set_nnp_on_point_of_no_return field in struct linux_binprm.
> > This field was needed to request that task_set_no_new_privs be set during an
> > execution, but only after the execution has proceeded beyond the point of no
> > return. I couldn't find a way to express this semantic without adding a new
> > bitfield to struct linux_binprm and a conditional in fs/exec.c. Please see
> > patch 2.

> What about using security_bprm_committing_creds()?

Good idea. Definitely cleaner.

Something like this? Then dropping the "execve: Add set_nnp_on_point_of_no_return"
commit.

This adds a bitfield to the landlock_cred_security struct to indicate that the flag
should be set on the next exec(s).

 include/uapi/linux/landlock.h | 14 ++++++++++++++
 security/landlock/cred.c      | 13 +++++++++++++
 security/landlock/cred.h      |  7 +++++++
 security/landlock/limits.h    |  2 +-
 security/landlock/ruleset.c   | 15 ++++++++++++---
 security/landlock/syscalls.c  |  5 +++++
 6 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
index f88fa1f68b77..edd9d9a7f60e 100644
--- a/include/uapi/linux/landlock.h
+++ b/include/uapi/linux/landlock.h
@@ -129,12 +129,26 @@ struct landlock_ruleset_attr {
  *
  *     If the calling thread is running with no_new_privs, this operation
  *     enables no_new_privs on the sibling threads as well.
+ *
+ * %LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
+ *    Sets no_new_privs on the calling thread before applying the Landlock domain.
+ *    This flag is useful for convenience as well as for applying a ruleset from
+ *    an outside context (e.g. BPF). This flag only has an effect when both
+ *    no_new_privs isn't already set and the caller doesn't possess CAP_SYS_ADMIN.
+ *
+ *    This flag has slightly different behavior when used from BPF. Instead of
+ *    setting no_new_privs on the current task, it sets a flag on the bprm so that
+ *    no_new_privs is set on the task at exec point-of-no-return. This guarantees
+ *    that the current execution is unaffected, and may escalate as usual until the
+ *    next exec, but the resulting task cannot gain more privileges through later
+ *    exec transitions.
  */
 /* clang-format off */
 #define LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF		(1U << 0)
 #define LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON			(1U << 1)
 #define LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF		(1U << 2)
 #define LANDLOCK_RESTRICT_SELF_TSYNC				(1U << 3)
+#define LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS			(1U << 4)
 /* clang-format on */
 
 /**
diff --git a/security/landlock/cred.c b/security/landlock/cred.c
index 0cb3edde4d18..bcc9b716916f 100644
--- a/security/landlock/cred.c
+++ b/security/landlock/cred.c
@@ -43,6 +43,18 @@ static void hook_cred_free(struct cred *const cred)
 		landlock_put_ruleset_deferred(dom);
 }
 
+static void hook_bprm_committing_creds(const struct linux_binprm *bprm)
+{
+	struct landlock_cred_security *const llcred = landlock_cred(bprm->cred);
+
+	if (llcred->set_nnp_on_committing_creds &&
+	    !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN)) {
+		task_set_no_new_privs(current);
+		/* Don't need to set it again for subsequent execution. */
+		llcred->set_nnp_on_committing_creds = false;
+	}
+}
+
 #ifdef CONFIG_AUDIT
 
 static int hook_bprm_creds_for_exec(struct linux_binprm *const bprm)
@@ -55,6 +67,7 @@ static int hook_bprm_creds_for_exec(struct linux_binprm *const bprm)
 #endif /* CONFIG_AUDIT */
 
 static struct security_hook_list landlock_hooks[] __ro_after_init = {
+	LSM_HOOK_INIT(bprm_committing_creds, hook_bprm_committing_creds),
 	LSM_HOOK_INIT(cred_prepare, hook_cred_prepare),
 	LSM_HOOK_INIT(cred_transfer, hook_cred_transfer),
 	LSM_HOOK_INIT(cred_free, hook_cred_free),
diff --git a/security/landlock/cred.h b/security/landlock/cred.h
index c10a06727eb1..7ec6dd12ebc3 100644
--- a/security/landlock/cred.h
+++ b/security/landlock/cred.h
@@ -49,6 +49,13 @@ struct landlock_cred_security {
 	 * not require a current domain.
 	 */
 	u8 log_subdomains_off : 1;
+	/**
+	 * @set_nnp_on_committing_creds: Set if the domain should set NO_NEW_PRIVS on the
+	 * execution past the point of no return in security_bprm_committing_creds().
+	 * This is not a hierarchy configuration because the nnp state is inherited by
+	 * exec and doesn't need further configuration.
+	 */
+	u8 set_nnp_on_committing_creds : 1;
 #endif /* CONFIG_AUDIT */
 } __packed;
 
diff --git a/security/landlock/limits.h b/security/landlock/limits.h
index eb584f47288d..d298086a4180 100644
--- a/security/landlock/limits.h
+++ b/security/landlock/limits.h
@@ -31,7 +31,7 @@
 #define LANDLOCK_MASK_SCOPE		((LANDLOCK_LAST_SCOPE << 1) - 1)
 #define LANDLOCK_NUM_SCOPE		__const_hweight64(LANDLOCK_MASK_SCOPE)
 
-#define LANDLOCK_LAST_RESTRICT_SELF	LANDLOCK_RESTRICT_SELF_TSYNC
+#define LANDLOCK_LAST_RESTRICT_SELF	LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
 #define LANDLOCK_MASK_RESTRICT_SELF	((LANDLOCK_LAST_RESTRICT_SELF << 1) - 1)
 
 /* clang-format on */
diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index 1d6fa74f2a52..ad0bd5994ec5 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -121,11 +121,13 @@ int landlock_restrict_cred_precheck(const __u32 flags,
 
 	/*
 	 * Similar checks as for seccomp(2), except that an -EPERM may be
-	 * returned.
+	 * returned, or no_new_privs may be set by the caller via
+	 * LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS.
 	 */
 	if (!task_no_new_privs(current) &&
 	    !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN)) {
-		return -EPERM;
+		if (!(flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS))
+			return -EPERM;
 	}
 
 	if (flags & ~LANDLOCK_MASK_RESTRICT_SELF)
@@ -140,7 +142,7 @@ int landlock_restrict_cred(struct cred *const cred,
 {
 	struct landlock_cred_security *new_llcred;
 	bool __maybe_unused log_same_exec, log_new_exec, log_subdomains,
-		prev_log_subdomains;
+		prev_log_subdomains, set_nnp_on_committing_creds;
 
 	/*
 	 * It is allowed to set LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF without
@@ -157,6 +159,12 @@ int landlock_restrict_cred(struct cred *const cred,
 	log_new_exec = !!(flags & LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON);
 	/* Translates "off" flag to boolean. */
 	log_subdomains = !(flags & LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF);
+	/*
+	 * Translates "on" flag to boolean. This flag is not inherited by exec,
+	 * but the resulting nnp state is.
+	 */
+	set_nnp_on_committing_creds =
+		!!(flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS);
 
 	new_llcred = landlock_cred(cred);
 
@@ -165,6 +173,7 @@ int landlock_restrict_cred(struct cred *const cred,
 	new_llcred->log_subdomains_off = !prev_log_subdomains ||
 					 !log_subdomains;
 #endif /* CONFIG_AUDIT */
+	new_llcred->set_nnp_on_committing_creds = set_nnp_on_committing_creds;
 
 	/*
 	 * The only case when a ruleset may not be set is if
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index c6c7be7698a2..f3520c764360 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -397,6 +397,7 @@ SYSCALL_DEFINE4(landlock_add_rule, const int, ruleset_fd,
  *         - %LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON
  *         - %LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF
  *         - %LANDLOCK_RESTRICT_SELF_TSYNC
+ *         - %LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
  *
  * This system call enforces a Landlock ruleset on the current thread.
  * Enforcing a ruleset requires that the task has %CAP_SYS_ADMIN in its
@@ -450,6 +451,10 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
 	if (!new_cred)
 		return -ENOMEM;
 
+	if (flags & LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS &&
+	    !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN))
+		task_set_no_new_privs(current);
+
 	err = landlock_restrict_cred(new_cred, ruleset, flags);
 	if (err) {
 		abort_creds(new_cred);
-- 
2.53.0


^ permalink raw reply related

* Re: [RFC PATCH 00/20] BPF interface for applying Landlock rulesets
From: Mickaël Salaün @ 2026-04-08 14:00 UTC (permalink / raw)
  To: Justin Suess
  Cc: ast, daniel, andrii, kpsingh, paul, viro, brauner, kees, gnoack,
	jack, jmorris, serge, song, yonghong.song, martin.lau, m, eddyz87,
	john.fastabend, sdf, skhan, bpf, linux-security-module,
	linux-kernel, linux-fsdevel, Frederick Lawler
In-Reply-To: <20260407200157.3874806-1-utilityemal77@gmail.com>

Thanks for this RFC.

On Tue, Apr 07, 2026 at 04:01:22PM -0400, Justin Suess wrote:
> Hello,
> 
> This series lets sleepable BPF LSM programs apply an existing,
> userspace-created Landlock ruleset to a program during exec.
> 
> The goal is not to move Landlock policy definition into BPF, nor to create a
> second policy engine.  Instead, BPF is used only to select when an already
> valid Landlock ruleset should be applied, based on runtime exec context.
> 
> Background
> ===
> 
> Landlock is primarily a syscall-driven, unprivileged-first LSM.  That model
> works well when the application being sandboxed can create and enforce its own
> rulesets, or when a trusted launcher can impose restrictions directly before
> running a trusted target.
> 
> That becomes harder when the target program is not under first-party control,
> for example:
> 
> 1. third-party binaries,
> 2. unmodified container images,
> 3. programs reached through shells, wrappers, or service managers, and
> 4. user-supplied or otherwise untrusted code.
> 
> In these cases, an external supervisor may want to apply a Landlock ruleset to
> the final executed program, while leaving unrelated parents or helper
> processes alone.
> 
> Why external sandboxing is awkward today
> ===
> 
> There are two recurring problems.
> 
> First, userspace cannot reliably predict every file a target may need across
> different systems, packaging layouts, and runtime conditions.  Shared
> libraries, configuration files, interpreters, and helper binaries often depend
> on details that are only known at runtime.

Agreed, it would make sense to leverage eBPF for this context
identification rather than implementing a Landlock-specific feature.

> 
> Second, Landlock inheritance is intentionally one-way.  Once a task is
> restricted, descendants inherit that domain and may only become more
> restricted.  This is exactly what Landlock should do, but it makes external
> sandboxing awkward when the program of interest is buried inside a larger exec
> chain.  Applying restrictions too early can affect unrelated intermediates;
> applying them too late misses the target entirely.

This makes sense too.

> 
> This series addresses that target-selection problem.
> 
> Overview
> ===
> 
> This series adds a small BPF-to-Landlock bridge:
> 
> 1. userspace creates a normal Landlock ruleset through the existing ABI;
> 2. userspace inserts that ruleset FD into a new
> 	BPF_MAP_TYPE_LANDLOCK_RULESET map;
> 3. a sleepable BPF LSM program attached to an exec-time hook looks up the
> 	ruleset; and
> 4. the program calls a kfunc to apply that ruleset to the new program's
> 	credentials before exec completes.
> 
> The important point is that BPF does not create, inspect, or mutate Landlock
> policy here.  It only decides whether to apply a ruleset that was already
> created and validated through Landlock's existing userspace API.

I like this approach.  It makes it possible for users to enforce Landlock
security policies on arbitrary new executions.  Sandboxing at this
specific point is the best time because it ensures consistency for the
whole lifetime of the process, whereas applying new restrictions in the
middle of an execution would make the process unstable (if the request
doesn't come from the process itself).
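
For reference, step 1 of the quoted overview only needs the existing
Landlock ABI.  Here is a minimal userspace sketch, assuming kernel headers
that ship <linux/landlock.h>.  The helper name make_ruleset_fd() and the
choice of handled access rights are illustrative assumptions, not taken
from the series.  Inserting the resulting FD into the proposed
BPF_MAP_TYPE_LANDLOCK_RULESET map is not shown because that interface
only exists in this RFC.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/landlock.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: creates a Landlock ruleset that handles (and so
 * denies by default) file writes and character device creation.
 * Returns a ruleset FD, or -1 with errno set (e.g. ENOSYS or EOPNOTSUPP
 * when Landlock is not available).
 */
int make_ruleset_fd(void)
{
	const struct landlock_ruleset_attr attr = {
		.handled_access_fs = LANDLOCK_ACCESS_FS_WRITE_FILE |
				     LANDLOCK_ACCESS_FS_MAKE_CHAR,
	};

	/* glibc has no wrapper for this syscall, so call it directly. */
	return (int)syscall(SYS_landlock_create_ruleset, &attr, sizeof(attr),
			    0);
}
```

The returned FD is what a supervisor would hand over to the map in step 2
of the overview; no rule is added here, so every handled access is denied.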

> 
> Interface
> ===
> 
> The series adds:
> 
> 1. bpf_landlock_restrict_binprm(), which applies a referenced ruleset to
> 	struct linux_binprm credentials;
> 2. bpf_landlock_put_ruleset(), which releases a referenced ruleset; and
> 3. BPF_MAP_TYPE_LANDLOCK_RULESET, a specialized map type for holding
> 	references to Landlock rulesets originating from userspace file
> 	descriptors.
> 4. A new field in the linux_binprm struct to enable application of
>    task_set_no_new_privs once execution is beyond the point of no return.

This "beyond the point of no return" is indeed important, and it would
be nice to also have this property for Landlock restrictions, i.e., only
create a Landlock domain if we know that the execution will succeed (or
if the caller will exit).  This is especially important for
logging/tracing event consistency.

> 
> The kfuncs are restricted to sleepable BPF LSM programs attached to
> bprm_creds_for_exec and bprm_creds_from_file, which are the points where the
> new program's credentials may still be updated safely.
> 
> This series also adds LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS.  On the BPF path,
> this is staged through the exec context and committed only after exec reaches
> point-of-no-return.  This avoids side effects on failed executions while
> ensuring that the resulting task cannot gain more privileges through later exec
> transitions. This is done through the set_nnp_on_point_of_no_return field.
> 
> This has a little subtlety: LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS in the BPF
> path will not stop the current execution from escalating at all; only subsequent
> ones.

This makes sense too, but it needs to be documented.
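
As a point of comparison for that documentation, this is how the
no_new_privs attribute is set and queried on the existing syscall path,
using prctl(2).  It is only a sketch of today's userspace behavior; the
helper names are hypothetical and nothing here exercises the proposed
LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS flag.

```c
#include <sys/prctl.h>

/*
 * Hypothetical helper: sets no_new_privs on the calling thread.
 * Returns 0 on success, -1 on error.  The attribute cannot be unset.
 */
int set_no_new_privs(void)
{
	return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/*
 * Hypothetical helper: returns 1 if no_new_privs is set for the calling
 * thread, 0 if it is not, -1 on error.
 */
int get_no_new_privs(void)
{
	return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

On this path the attribute takes effect immediately, so it already applies
to the next execve(2), unlike the BPF path described in the quoted text.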

> This is intentional to allow landlock policies to be applied through a

s/landlock/Landlock/g in every text/comment/commit description please.

> setuid transition for instance, without affecting the current escalation.
> 
> Semantics
> ===
> 
> This proposal is intended to preserve Landlock semantics as much as practical
> for an exec-time BPF attachment model:
> 
> 1. only pre-existing Landlock rulesets may be applied;
> 2. BPF cannot construct, inspect, or modify rulesets;

Inspection will be possible with tracepoints, but it is orthogonal to
this series.

> 3. enforcement still happens before the new program begins execution;
> 4. normal Landlock inheritance, layering, and future composition remain
> 	unchanged; and
> 5. this does not bypass Landlock's privilege checks for applying Landlock
>     rulesets.
> 
> In other words, BPF acts as an external selector for when to apply Landlock,
> not as a replacement for Landlock's enforcement engine.
> 
> All behavior, future access rights, and previous access rights are designed
> to automatically be supported from either BPF or existing syscall contexts.
> 
> The main semantic difference is LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS on the BPF
> path: it guarantees that the resulting task is pinned with no_new_privs before
> it can perform later exec transitions, but it does not retroactively suppress
> privilege gain for the current exec transition itself.
> 
> The other exception to semantics is the LANDLOCK_RESTRICT_SELF_TSYNC flag.
> (see Points of Feedback section)
> 
> Patch layout
> ===
> 
> Patches 1-5 prepare the Landlock side by moving shared ruleset logic out of
> syscalls.c, adding a no_new_privs flag for non-syscall callers, exposing
> linux_binprm->set_nnp_on_point_of_no_return as an interface to set no_new_privs
> on the point of no return, and making deferred ruleset destruction RCU-safe.
> 
> Patches 6-10 add the BPF-facing pieces: the Landlock kfuncs, the new map type,
> syscall handling for that map, and verifier support.
> 
> Patches 11-15 add selftests and the small bpftool update needed for the new
> map type.
> 
> Patches 16-20 add docs and bump the ABI version and update MAINTAINERS.
> 
> Feedback is especially welcome on the overall interface shape, the choice of
> hooks, and the map semantics.

I'll review each patch separately, but this approach is promising.

I think it would be simpler to have a dedicated patch series for
LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS, and then send another series
specific to the eBPF side (kfunc, tests, doc...).  I'm not sure what the
best way to deal with dependencies across Landlock and BPF is, though.
What is the policy for BPF next wrt other next branches?

> 
> Testing
> ===
> 
> This patch series has two portions of tests.
> 
> One lives in the traditional Landlock selftests, for the new
> LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS flag.
> 
> The other suite lives under the BPF selftests, and this tests the Landlock
> kfuncs and the new BPF_MAP_TYPE_LANDLOCK_RULESET.
> 
> This patch series was run through BPF CI, the results of which are here. [1]
> 
> All mentioned tests are passing, as well as the BPF CI.
> 
> [1] : https://github.com/kernel-patches/bpf/pull/11562
> 
> Points of Feedback
> ===
> 
> First, the new set_nnp_on_point_of_no_return field in struct linux_binprm.
> This field was needed to request that task_set_no_new_privs be set during an
> execution, but only after the execution has proceeded beyond the point of no
> return. I couldn't find a way to express this semantic without adding a new
> bitfield to struct linux_binprm and a conditional in fs/exec.c. Please see
> patch 2.

What about using security_bprm_committing_creds()?

> 
> Feedback on the BPF testing harness, which was generated with AI assistance as
> disclosed in the commit footer, is welcomed. I have only limited familiarity
> with BPF testing practices. These tests were made with strong human supervision.
> See patches 14 and 15.
> 
> Feedback on the NO_NEW_PRIVS situation is also welcomed. Because task_set_no_new_privs()
> would otherwise leak state on failed executions or AT_EXECVE_CHECK, this series
> stages no_new_privs through the exec context and only commits it after
> point-of-no-return. This preserves failure behavior while still ensuring that
> the resulting task cannot elevate further through later exec transitions.
> When called from bprm_creds_from_file, this does not retroactively change the
> privilege outcome of the current exec transition itself.
> 
> See patch 2 and 3.
> 
> Next, the RCU in the landlock_ruleset. Existing BPF maps use RCU to make sure maps
> holding references stay valid. I altered the landlock ruleset to use rcu_work
> to make sure that the rcu is synchronized before putting on a ruleset, and
> acquire the rcu in the arraymap implementation. See patches 5-10.
> 
> Next, the semantics of the map. What operations should be supported from BPF
> and userspace and what data types should they return? I consider the struct
> bpf_landlock_ruleset to be opaque. Userspace can add items to the map via the
> fd, delete items by their index, and BPF can delete and lookup items by their
> index. Items cannot be updated, only swapped.
> 
> Finally, the handling of the LANDLOCK_RESTRICT_SELF_TSYNC flag. This flag has
> no meaning in a pre-execution context, as the credentials during the designated
> LSM hooks (bprm_creds_for_exec/creds_from_file) still represent the pre-execution
> task. Therefore, this flag is invalidated and attempting to use it with
> bpf_landlock_restrict_binprm will return -EINVAL. Otherwise, the flag would
> result in applying the landlock ruleset to the wrong target in addition to the
> intended one. (see patch 2). This behavior is validated with selftests.
> 
> Existing works / Credits
> ===
> 
> Mickaël Salaün created patchsets adding BPF tracepoints for landlock in [2] [3].
> 
> Mickaël also gave feedback on this feature and the idea in this GitHub thread. [4]
> 
> Günther Noack initially received and provided initial feedback on this idea as
> an early prototype.
> 
> Liz Rice, author of "Learning eBPF: Programming the Linux Kernel for Enhanced
> Observability, Networking, and Security" provided background and inspired me to
> experiment with BPF and the BPF LSM. [5]
> 
> [2] : https://lore.kernel.org/all/20250523165741.693976-1-mic@digikod.net/
> [3] : https://lore.kernel.org/linux-security-module/20260406143717.1815792-1-mic@digikod.net/
> [4] : https://github.com/landlock-lsm/linux/issues/56
> [5] : https://wellesleybooks.com/book/9781098135126
> 
> Kind Regards,
> Justin Suess
> 
> Justin Suess (20):
>   landlock: Move operations from syscall into ruleset code
>   execve: Add set_nnp_on_point_of_no_return
>   landlock: Implement LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
>   selftests/landlock: Cover LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
>   landlock: Make ruleset deferred free RCU safe
>   bpf: lsm: Add Landlock kfuncs
>   bpf: arraymap: Implement Landlock ruleset map
>   bpf: Add Landlock ruleset map type
>   bpf: syscall: Handle Landlock ruleset maps
>   bpf: verifier: Add Landlock ruleset map support
>   selftests/bpf: Add Landlock kfunc declarations
>   selftests/landlock: Rename gettid wrapper for BPF reuse
>   selftests/bpf: Enable Landlock in selftests kernel.
>   selftests/bpf: Add Landlock kfunc test program
>   selftests/bpf: Add Landlock kfunc test runner
>   landlock: Bump ABI version
>   tools: bpftool: Add documentation for landlock_ruleset
>   landlock: Document LANDLOCK_RESTRICT_SELF_NO_NEW_PRIVS
>   bpf: Document BPF_MAP_TYPE_LANDLOCK_RULESET
>   MAINTAINERS: update entry for the Landlock subsystem
> 
>  Documentation/bpf/map_landlock_ruleset.rst    | 181 +++++
>  Documentation/userspace-api/landlock.rst      |  22 +-
>  MAINTAINERS                                   |   4 +
>  fs/exec.c                                     |   8 +
>  include/linux/binfmts.h                       |   7 +-
>  include/linux/bpf_lsm.h                       |  15 +
>  include/linux/bpf_types.h                     |   1 +
>  include/linux/landlock.h                      |  92 +++
>  include/uapi/linux/bpf.h                      |   1 +
>  include/uapi/linux/landlock.h                 |  14 +
>  kernel/bpf/arraymap.c                         |  67 ++
>  kernel/bpf/bpf_lsm.c                          | 145 ++++
>  kernel/bpf/syscall.c                          |   4 +-
>  kernel/bpf/verifier.c                         |  15 +-
>  samples/landlock/sandboxer.c                  |   7 +-
>  security/landlock/limits.h                    |   2 +-
>  security/landlock/ruleset.c                   | 198 ++++-
>  security/landlock/ruleset.h                   |  25 +-
>  security/landlock/syscalls.c                  | 158 +---
>  .../bpf/bpftool/Documentation/bpftool-map.rst |   2 +-
>  tools/bpf/bpftool/map.c                       |   2 +-
>  tools/include/uapi/linux/bpf.h                |   1 +
>  tools/lib/bpf/libbpf.c                        |   1 +
>  tools/lib/bpf/libbpf_probes.c                 |   6 +
>  tools/testing/selftests/bpf/bpf_kfuncs.h      |  20 +
>  tools/testing/selftests/bpf/config            |   5 +
>  tools/testing/selftests/bpf/config.x86_64     |   1 -
>  .../bpf/prog_tests/landlock_kfuncs.c          | 733 ++++++++++++++++++
>  .../selftests/bpf/progs/landlock_kfuncs.c     |  92 +++
>  tools/testing/selftests/landlock/base_test.c  |  10 +-
>  tools/testing/selftests/landlock/common.h     |  28 +-
>  tools/testing/selftests/landlock/fs_test.c    | 103 +--
>  tools/testing/selftests/landlock/net_test.c   |  55 +-
>  .../testing/selftests/landlock/ptrace_test.c  |  14 +-
>  .../landlock/scoped_abstract_unix_test.c      |  51 +-
>  .../selftests/landlock/scoped_base_variants.h |  23 +
>  .../selftests/landlock/scoped_common.h        |   5 +-
>  .../selftests/landlock/scoped_signal_test.c   |  30 +-
>  tools/testing/selftests/landlock/wrappers.h   |   2 +-
>  39 files changed, 1877 insertions(+), 273 deletions(-)
>  create mode 100644 Documentation/bpf/map_landlock_ruleset.rst
>  create mode 100644 include/linux/landlock.h
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/landlock_kfuncs.c
>  create mode 100644 tools/testing/selftests/bpf/progs/landlock_kfuncs.c
> 
> 
> base-commit: 8c6a27e02bc55ab110d1828610048b19f903aaec
> -- 
> 2.53.0
> 
> 

* Re: LSM: Whiteout chardev creation sidesteps mknod hook
From: Mickaël Salaün @ 2026-04-08 12:24 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Günther Noack, Paul Moore, linux-security-module,
	John Johansen, Georgia Garcia, Kentaro Takeda, Tetsuo Handa,
	linux-fsdevel, Alejandro Colomar
In-Reply-To: <20260408.beu1Eing5aFo@digikod.net>

CCing fsdevel and Alejandro.

On Wed, Apr 08, 2026 at 01:01:31PM +0200, Mickaël Salaün wrote:
> On Tue, Apr 07, 2026 at 03:05:13PM +0200, Günther Noack wrote:
> > Hello Christian, Paul, Mickaël and LSM maintainers!
> > 
> > I discovered the following bug in Landlock, which potentially also
> > affects other LSMs:
> > 
> > With renameat2(2)'s RENAME_WHITEOUT flag, it is possible to create a
> > "whiteout object" at the source of the rename.  Whiteout objects are
> > character devices with major/minor (0, 0) -- these devices are not
> > bound to any driver, so they are harmless, but still, the creation of
> > these files can sidestep the LANDLOCK_ACCESS_FS_MAKE_CHAR access right
> > in Landlock.
> 
> Any way to "write" on the filesystem should properly be controlled.  The
> man page says that RENAME_WHITEOUT requires CAP_MKNOD, however, looking
> at vfs_mknod(), there is an explicit exception to not check CAP_MKNOD
> for whiteout devices. See commit a3c751a50fe6 ("vfs: allow unprivileged
> whiteout creation").
> 
> > 
> > 
> > I am unconvinced which is the right fix here -- do you have an opinion
> > on this from the VFS/LSM side?
> > 
> > 
> > Option 1: Make filesystems call security_path_mknod() during RENAME_WHITEOUT?
> 
> This is the right semantic.
> 
> > 
> > Do it in the VFS rename hook.
> > 
> > * Pro: Fixes it for all LSMs
> > * Con: Call would have to be done in multiple filesystems
> 
> That would not work.
> 
> > 
> > 
> > Option 2: Handle it in security_{path,inode}_rename()
> > 
> > Make Landlock handle it in security_inode_rename() by looking for the
> > RENAME_WHITEOUT flag.
> > 
> > * Con: Operation should only be denied if the file system even
> >   implements RENAME_WHITEOUT, and we would have to maintain a list of
> >   affected filesystems for that.  (That feels like solving it at the
> >   wrong layer of abstraction.)
> 
> Why would we need to maintain such a list?  If it's only about the errno,
> well, that would not be perfect but would be OK with proper documentation.
> 
> I'm mostly worried that there might be other (future) call paths to
> create whiteout devices.
> 
> I think option 2 would be the most practical approach for Landlock, with
> a new LANDLOCK_ACCESS_FS_MAKE_WHITEOUT right.
> 
> I'm also wondering what the chances are that other kinds of special file
> types, like a whiteout device, could come up in the future.  Any guess,
> Christian?
> 
> > * Con: Unclear whether other LSMs need a similar fix
> 
> I guess at least AppArmor and Tomoyo would consider that an issue.
> 
> > 
> > 
> > Option 3: Declare that this is working as intended?
> 
> We need to be able to control any file creation, which is not currently
> the case because of this whiteout exception.
> 
> > 
> > * Pro: (0, 0) is not a "real" character device
> > 
> > 
> > In cases 1 and 2, we'd likely need to double check that we are not
> > breaking existing scenarios involving OverlayFS, by suddenly requiring
> > a more lax policy for creating character devices on these directories.
> > 
> > Please let me know what you think.  I'm specifically interested in:
> > 
> > 1. Christian: What is the appropriate way to do this VFS wise?
> > 2. LSM maintainers: Is this a bug that affects other LSMs as well?
> > 
> > Thanks,
> > —Günther
> > 
> > P.S.: For full transparency, I found this bug by pointing Google
> > Gemini at the Landlock codebase.
> > 
