Linux Security Modules development
* [PATCH v4 2/7] landlock: Add UDP connect() access control
From: Matthieu Buffet @ 2026-05-02 12:43 UTC
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Mikhail Ivanov,
	konstantin.meskhidze, Tingmao Wang, netdev, Matthieu Buffet
In-Reply-To: <20260502124306.3975990-1-matthieu@buffet.re>

Add support for a second fine-grained UDP access right.
This first half of LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP controls the
ability to set the remote port of a socket (via connect()). It will be
useful for applications that send datagrams, and for some servers too
(those creating per-client sockets, which want to receive traffic only
from a specific address).

As with bind(), this access control is performed when configuring
sockets, not in hot code paths.

Also detect when autobind is about to be required, and check whether
the process would be allowed to call bind(0) explicitly. Autobind can
only happen when sending a first datagram, when connect()ing, and in a
splice() EOF edge case which, as far as I understand, can only occur
after a remote peer has been set (which is already covered).
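
As a userspace illustration of the autobind behaviour described above,
here is a minimal sketch (not part of this patch; the helper names
udp_local_port() and udp_autobind_demo() are made up for the example).
It assumes an IPv4 loopback interface is available and records the
local port of a fresh UDP socket before and after connect():

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns the local port of fd in host byte order, or -errno. */
int udp_local_port(int fd)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);

	memset(&addr, 0, sizeof(addr));
	if (getsockname(fd, (struct sockaddr *)&addr, &len) != 0)
		return -errno;
	return ntohs(addr.sin_port);
}

/*
 * Records the local port of a fresh IPv4 UDP socket before and after
 * connect(). Returns 0 on success, or -errno if the environment does
 * not allow the socket calls (e.g. no loopback interface).
 */
int udp_autobind_demo(int *before, int *after)
{
	struct sockaddr_in dst;
	int fd, ret = 0;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -errno;

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(9); /* Arbitrary port: nothing is sent. */
	dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	*before = udp_local_port(fd);
	if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) != 0)
		ret = -errno;
	else
		*after = udp_local_port(fd);

	close(fd);
	return ret;
}
```

On a typical Linux system the port read before connect() should be 0
and the one read after should be in the ephemeral range. It is that
implicit bind which the new check treats like an explicit bind(0).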

Signed-off-by: Matthieu Buffet <matthieu@buffet.re>
---
 include/uapi/linux/landlock.h               | 19 +++++
 security/landlock/audit.c                   |  2 +
 security/landlock/limits.h                  |  2 +-
 security/landlock/net.c                     | 79 +++++++++++++++++----
 tools/testing/selftests/landlock/net_test.c |  5 +-
 5 files changed, 92 insertions(+), 15 deletions(-)

diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
index 045b251ff1b4..22c8cc63f30e 100644
--- a/include/uapi/linux/landlock.h
+++ b/include/uapi/linux/landlock.h
@@ -378,11 +378,30 @@ struct landlock_net_port_attr {
  *
  * - %LANDLOCK_ACCESS_NET_BIND_UDP: Bind UDP sockets to the given local
  *   port. Support added in Landlock ABI version 10.
+ * - %LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP: Set the remote port of UDP
+ *   sockets to the given port, or send datagrams to the given remote port
+ *   ignoring any destination pre-set on a socket. Support added in
+ *   Landlock ABI version 10.
+ *
+ * .. note:: Setting a remote address or sending a first datagram
+ *   auto-binds UDP sockets to an ephemeral local source port if not
+ *   already bound. To allow this if both %LANDLOCK_ACCESS_NET_BIND_UDP
+ *   and %LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP are handled, you need to
+ *   either:
+ *
+ *   - use a socket already bound to a port before the ruleset started
+ *     being enforced;
+ *   - or grant %LANDLOCK_ACCESS_NET_BIND_UDP on port 0, meaning "any
+ *     port in the ephemeral port range";
+ *   - or grant %LANDLOCK_ACCESS_NET_BIND_UDP on a specific port, and
+ *     call :manpage:`bind(2)` on that port before trying to
+ *     :manpage:`connect(2)` or send datagrams.
  */
 /* clang-format off */
 #define LANDLOCK_ACCESS_NET_BIND_TCP			(1ULL << 0)
 #define LANDLOCK_ACCESS_NET_CONNECT_TCP			(1ULL << 1)
 #define LANDLOCK_ACCESS_NET_BIND_UDP			(1ULL << 2)
+#define LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP		(1ULL << 3)
 /* clang-format on */
 
 /**
diff --git a/security/landlock/audit.c b/security/landlock/audit.c
index e676ebffeebe..851647197a01 100644
--- a/security/landlock/audit.c
+++ b/security/landlock/audit.c
@@ -46,6 +46,8 @@ static const char *const net_access_strings[] = {
 	[BIT_INDEX(LANDLOCK_ACCESS_NET_BIND_TCP)] = "net.bind_tcp",
 	[BIT_INDEX(LANDLOCK_ACCESS_NET_CONNECT_TCP)] = "net.connect_tcp",
 	[BIT_INDEX(LANDLOCK_ACCESS_NET_BIND_UDP)] = "net.bind_udp",
+	[BIT_INDEX(LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP)] =
+		"net.connect_send_udp",
 };
 
 static_assert(ARRAY_SIZE(net_access_strings) == LANDLOCK_NUM_ACCESS_NET);
diff --git a/security/landlock/limits.h b/security/landlock/limits.h
index c0f30a4591b8..a4d908b240a2 100644
--- a/security/landlock/limits.h
+++ b/security/landlock/limits.h
@@ -23,7 +23,7 @@
 #define LANDLOCK_MASK_ACCESS_FS		((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
 #define LANDLOCK_NUM_ACCESS_FS		__const_hweight64(LANDLOCK_MASK_ACCESS_FS)
 
-#define LANDLOCK_LAST_ACCESS_NET	LANDLOCK_ACCESS_NET_BIND_UDP
+#define LANDLOCK_LAST_ACCESS_NET	LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP
 #define LANDLOCK_MASK_ACCESS_NET	((LANDLOCK_LAST_ACCESS_NET << 1) - 1)
 #define LANDLOCK_NUM_ACCESS_NET		__const_hweight64(LANDLOCK_MASK_ACCESS_NET)
 
diff --git a/security/landlock/net.c b/security/landlock/net.c
index f9ccb52e7d45..045881f81295 100644
--- a/security/landlock/net.c
+++ b/security/landlock/net.c
@@ -68,16 +68,17 @@ static int current_check_access_socket(struct socket *const sock,
 
 	switch (address->sa_family) {
 	case AF_UNSPEC:
-		if (access_request == LANDLOCK_ACCESS_NET_CONNECT_TCP) {
+		if (access_request == LANDLOCK_ACCESS_NET_CONNECT_TCP ||
+		    access_request == LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP) {
 			/*
 			 * Connecting to an address with AF_UNSPEC dissolves
-			 * the TCP association, which have the same effect as
-			 * closing the connection while retaining the socket
-			 * object (i.e., the file descriptor).  As for dropping
-			 * privileges, closing connections is always allowed.
-			 *
-			 * For a TCP access control system, this request is
-			 * legitimate. Let the network stack handle potential
+			 * the remote association while retaining the socket
+			 * object (i.e., the file descriptor). For TCP, it has
+			 * the same effect as closing the connection. For UDP,
+			 * it removes any preset remote address. As for
+			 * dropping privileges, these actions are always
+			 * allowed.
+			 * Let the network stack handle potential
 			 * inconsistencies and return -EINVAL if needed.
 			 */
 			return 0;
@@ -134,7 +135,8 @@ static int current_check_access_socket(struct socket *const sock,
 		addr4 = (struct sockaddr_in *)address;
 		port = addr4->sin_port;
 
-		if (access_request == LANDLOCK_ACCESS_NET_CONNECT_TCP) {
+		if (access_request == LANDLOCK_ACCESS_NET_CONNECT_TCP ||
+		    access_request == LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP) {
 			audit_net.dport = port;
 			audit_net.v4info.daddr = addr4->sin_addr.s_addr;
 		} else if (access_request == LANDLOCK_ACCESS_NET_BIND_TCP ||
@@ -157,7 +159,8 @@ static int current_check_access_socket(struct socket *const sock,
 		addr6 = (struct sockaddr_in6 *)address;
 		port = addr6->sin6_port;
 
-		if (access_request == LANDLOCK_ACCESS_NET_CONNECT_TCP) {
+		if (access_request == LANDLOCK_ACCESS_NET_CONNECT_TCP ||
+		    access_request == LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP) {
 			audit_net.dport = port;
 			audit_net.v6info.daddr = addr6->sin6_addr;
 		} else if (access_request == LANDLOCK_ACCESS_NET_BIND_TCP ||
@@ -213,6 +216,50 @@ static int current_check_access_socket(struct socket *const sock,
 	return -EACCES;
 }
 
+static int current_check_autobind_udp_socket(struct socket *const sock)
+{
+	struct sockaddr_storage port0 = { 0 };
+
+	/*
+	 * On UDP sockets, if a local port has not already been bound,
+	 * calling connect() or sending a first datagram has the side
+	 * effect of autobinding an ephemeral port: we also have to check
+	 * that the process would have had the right to bind(0) explicitly.
+	 * Note: socket is not locked, so another thread could do an
+	 * explicit bind(!=0) on this socket, changing inet_num to non-zero
+	 * after we read it, but this would only have us enforce an
+	 * additional bind(0) access check and would not bypass policy.
+	 */
+	if (inet_sk(sock->sk)->inet_num != 0)
+		return 0;
+
+	/*
+	 * Construct a struct sockaddr* with port 0 to pretend the
+	 * process tried to bind() on that address.
+	 */
+	port0.ss_family = sock->sk->__sk_common.skc_family;
+	switch (port0.ss_family) {
+	case AF_INET: {
+		((struct sockaddr_in *)&port0)->sin_port = 0;
+		break;
+	}
+
+#if IS_ENABLED(CONFIG_IPV6)
+	case AF_INET6: {
+		((struct sockaddr_in6 *)&port0)->sin6_port = 0;
+		break;
+	}
+#endif /* IS_ENABLED(CONFIG_IPV6) */
+
+	default:
+		return 0;
+	}
+
+	return current_check_access_socket(sock, (struct sockaddr *)&port0,
+					   sizeof(port0),
+					   LANDLOCK_ACCESS_NET_BIND_UDP);
+}
+
 static int hook_socket_bind(struct socket *const sock,
 			    struct sockaddr *const address, const int addrlen)
 {
@@ -234,14 +281,22 @@ static int hook_socket_connect(struct socket *const sock,
 			       const int addrlen)
 {
 	access_mask_t access_request;
+	int ret = 0;
 
 	if (sk_is_tcp(sock->sk))
 		access_request = LANDLOCK_ACCESS_NET_CONNECT_TCP;
+	else if (sk_is_udp(sock->sk))
+		access_request = LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP;
 	else
 		return 0;
 
-	return current_check_access_socket(sock, address, addrlen,
-					   access_request);
+	ret = current_check_access_socket(sock, address, addrlen,
+					  access_request);
+
+	if (ret == 0 && sk_is_udp(sock->sk))
+		ret = current_check_autobind_udp_socket(sock);
+
+	return ret;
 }
 
 static struct security_hook_list landlock_hooks[] __ro_after_init = {
diff --git a/tools/testing/selftests/landlock/net_test.c b/tools/testing/selftests/landlock/net_test.c
index ec392d971ea3..016c7277e370 100644
--- a/tools/testing/selftests/landlock/net_test.c
+++ b/tools/testing/selftests/landlock/net_test.c
@@ -1326,12 +1326,13 @@ FIXTURE_TEARDOWN(mini)
 
 /* clang-format off */
 
-#define ACCESS_LAST LANDLOCK_ACCESS_NET_BIND_UDP
+#define ACCESS_LAST LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP
 
 #define ACCESS_ALL ( \
 	LANDLOCK_ACCESS_NET_BIND_TCP | \
 	LANDLOCK_ACCESS_NET_CONNECT_TCP | \
-	LANDLOCK_ACCESS_NET_BIND_UDP)
+	LANDLOCK_ACCESS_NET_BIND_UDP | \
+	LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP)
 
 /* clang-format on */
 
-- 
2.39.5
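
For readers unfamiliar with the AF_UNSPEC case handled in the net.c
hunk of this patch, the following userspace sketch (not part of the
patch; udp_peer_status() and udp_dissolve_demo() are hypothetical
helper names) shows what "removing any preset remote address" looks
like on a UDP socket. It assumes an IPv4 loopback interface:

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 0 if fd has a remote peer set, else the getpeername() errno. */
int udp_peer_status(int fd)
{
	struct sockaddr_storage peer;
	socklen_t len = sizeof(peer);

	if (getpeername(fd, (struct sockaddr *)&peer, &len) == 0)
		return 0;
	return errno;
}

/*
 * Connects a UDP socket to loopback, then dissolves the association
 * with connect(AF_UNSPEC), recording the peer status after each step.
 * Returns 0 on success, or -errno if a socket call failed.
 */
int udp_dissolve_demo(int *status_connected, int *status_dissolved)
{
	struct sockaddr_in dst;
	struct sockaddr unspec;
	int fd, ret = 0;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -errno;

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(9); /* Arbitrary port: nothing is sent. */
	dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	memset(&unspec, 0, sizeof(unspec));
	unspec.sa_family = AF_UNSPEC;

	if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) != 0) {
		ret = -errno;
		goto out;
	}
	*status_connected = udp_peer_status(fd);

	if (connect(fd, &unspec, sizeof(unspec)) != 0) {
		ret = -errno;
		goto out;
	}
	*status_dissolved = udp_peer_status(fd);
out:
	close(fd);
	return ret;
}
```

After the second connect(), getpeername() is expected to fail with
ENOTCONN: the socket object survives but no longer has a remote peer,
which is why the hook lets this request through unconditionally.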


* [PATCH v4 3/7] landlock: Add UDP send access control
From: Matthieu Buffet @ 2026-05-02 12:43 UTC
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Mikhail Ivanov,
	konstantin.meskhidze, Tingmao Wang, netdev, Matthieu Buffet
In-Reply-To: <20260502124306.3975990-1-matthieu@buffet.re>

Add the second half of LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP: control the
ability to specify an explicit destination when sending a datagram,
overriding any remote peer set on a UDP socket (in sendto(), sendmsg(),
and sendmmsg()). This makes the right useful for clients that specify a
destination address with each datagram they send.
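
To make the resulting decision easier to follow, here is a simplified
userspace model of the per-datagram check (a sketch only, not kernel
code). Everything in it is hypothetical: struct udp_policy_model,
MODEL_NO_DEST and model_udp_sendmsg() are names invented for the
example. It assumes a single-layer domain that handles both UDP
rights, and it ignores the address-length and family-mismatch checks
done by the real function:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical marker for "no destination passed" (msg_name == NULL). */
#define MODEL_NO_DEST (-1)

/* Hypothetical stand-in for a domain's UDP port rules. */
struct udp_policy_model {
	const uint16_t *send_ports; /* ports granted CONNECT_SEND_UDP */
	size_t nr_send_ports;
	const uint16_t *bind_ports; /* ports granted BIND_UDP */
	size_t nr_bind_ports;
};

static bool port_allowed(const uint16_t *ports, size_t nr, uint16_t port)
{
	for (size_t i = 0; i < nr; i++)
		if (ports[i] == port)
			return true;
	return false;
}

/*
 * Models the check for one datagram. local_port is the port the socket
 * is currently bound to (0 if unbound); dst_family is MODEL_NO_DEST
 * when the datagram relies on the socket's preset peer.
 */
int model_udp_sendmsg(const struct udp_policy_model *policy,
		      int sock_family, uint16_t local_port,
		      int dst_family, uint16_t dst_port)
{
	if (dst_family != MODEL_NO_DEST) {
		/* Explicit AF_UNSPEC destination on an IPv6 socket. */
		if (dst_family == AF_UNSPEC && sock_family == AF_INET6)
			return -EACCES;
		/* On IPv4 sockets, AF_UNSPEC is checked like AF_INET. */
		if (!port_allowed(policy->send_ports,
				  policy->nr_send_ports, dst_port))
			return -EACCES;
	}

	/* An unbound socket is about to autobind: require bind(0). */
	if (local_port == 0 &&
	    !port_allowed(policy->bind_ports, policy->nr_bind_ports, 0))
		return -EACCES;

	return 0;
}
```

With a policy granting the send right on port 53 only, this model
denies a first datagram from an unbound socket unless port 0 is also
granted the bind right, and it lets a datagram without an explicit
destination through once the socket is bound.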

Signed-off-by: Matthieu Buffet <matthieu@buffet.re>
---
 include/uapi/linux/landlock.h |  4 ++
 security/landlock/net.c       | 70 ++++++++++++++++++++++++++++++++---
 2 files changed, 68 insertions(+), 6 deletions(-)

diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
index 22c8cc63f30e..b147223efc97 100644
--- a/include/uapi/linux/landlock.h
+++ b/include/uapi/linux/landlock.h
@@ -396,6 +396,10 @@ struct landlock_net_port_attr {
  *   - or grant %LANDLOCK_ACCESS_NET_BIND_UDP on a specific port, and
  *     call :manpage:`bind(2)` on that port before trying to
  *     :manpage:`connect(2)` or send datagrams.
+ *
+ * .. note:: Sending datagrams to an ``AF_UNSPEC`` destination address
+ *   family is not supported for IPv6 UDP sockets: you will need to use a
+ *   ``NULL`` address instead.
  */
 /* clang-format off */
 #define LANDLOCK_ACCESS_NET_BIND_TCP			(1ULL << 0)
diff --git a/security/landlock/net.c b/security/landlock/net.c
index 045881f81295..8a53aebdb8c6 100644
--- a/security/landlock/net.c
+++ b/security/landlock/net.c
@@ -44,7 +44,8 @@ int landlock_append_net_rule(struct landlock_ruleset *const ruleset,
 static int current_check_access_socket(struct socket *const sock,
 				       struct sockaddr *const address,
 				       const int addrlen,
-				       access_mask_t access_request)
+				       access_mask_t access_request,
+				       bool connecting)
 {
 	__be16 port;
 	struct layer_access_masks layer_masks = {};
@@ -69,7 +70,8 @@ static int current_check_access_socket(struct socket *const sock,
 	switch (address->sa_family) {
 	case AF_UNSPEC:
 		if (access_request == LANDLOCK_ACCESS_NET_CONNECT_TCP ||
-		    access_request == LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP) {
+		    (access_request == LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP &&
+		     connecting)) {
 			/*
 			 * Connecting to an address with AF_UNSPEC dissolves
 			 * the remote association while retaining the socket
@@ -82,6 +84,35 @@ static int current_check_access_socket(struct socket *const sock,
 			 * inconsistencies and return -EINVAL if needed.
 			 */
 			return 0;
+		} else if (access_request ==
+			   LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP) {
+			if (sock->sk->__sk_common.skc_family == AF_INET6) {
+				/*
+				 * Deny sending UDP datagrams to an explicit
+				 * AF_UNSPEC address on IPv6 sockets, even
+				 * though they treat it as "no address" (which
+				 * should always be allowed): another thread
+				 * can switch the socket to IPv4 under our
+				 * feet with setsockopt(IPV6_ADDRFORM), and
+				 * IPv4 sockets would then treat AF_UNSPEC
+				 * as AF_INET.
+				 */
+				audit_net.family = AF_UNSPEC;
+				landlock_init_layer_masks(
+					subject->domain, access_request,
+					&layer_masks, LANDLOCK_KEY_NET_PORT);
+				landlock_log_denial(
+					subject,
+					&(struct landlock_request){
+						.type = LANDLOCK_REQUEST_NET_ACCESS,
+						.audit.type =
+							LSM_AUDIT_DATA_NET,
+						.audit.u.net = &audit_net,
+						.access = access_request,
+						.layer_masks = &layer_masks,
+					});
+				return -EACCES;
+			}
 		} else if (access_request == LANDLOCK_ACCESS_NET_BIND_TCP ||
 			   access_request == LANDLOCK_ACCESS_NET_BIND_UDP) {
 			/*
@@ -124,7 +155,10 @@ static int current_check_access_socket(struct socket *const sock,
 		} else {
 			WARN_ON_ONCE(1);
 		}
-		/* Only for bind(AF_UNSPEC+INADDR_ANY) on IPv4 socket. */
+		/*
+		 * For bind(AF_UNSPEC+INADDR_ANY) on IPv4 socket and
+		 * for sending to AF_UNSPEC addresses on IPv4 socket.
+		 */
 		fallthrough;
 	case AF_INET: {
 		const struct sockaddr_in *addr4;
@@ -257,7 +291,7 @@ static int current_check_autobind_udp_socket(struct socket *const sock)
 
 	return current_check_access_socket(sock, (struct sockaddr *)&port0,
 					   sizeof(port0),
-					   LANDLOCK_ACCESS_NET_BIND_UDP);
+					   LANDLOCK_ACCESS_NET_BIND_UDP, false);
 }
 
 static int hook_socket_bind(struct socket *const sock,
@@ -273,7 +307,7 @@ static int hook_socket_bind(struct socket *const sock,
 		return 0;
 
 	return current_check_access_socket(sock, address, addrlen,
-					   access_request);
+					   access_request, false);
 }
 
 static int hook_socket_connect(struct socket *const sock,
@@ -291,7 +325,7 @@ static int hook_socket_connect(struct socket *const sock,
 		return 0;
 
 	ret = current_check_access_socket(sock, address, addrlen,
-					  access_request);
+					  access_request, true);
 
 	if (ret == 0 && sk_is_udp(sock->sk))
 		ret = current_check_autobind_udp_socket(sock);
@@ -299,9 +333,33 @@ static int hook_socket_connect(struct socket *const sock,
 	return ret;
 }
 
+static int hook_socket_sendmsg(struct socket *const sock,
+			       struct msghdr *const msg, const int size)
+{
+	struct sockaddr *const address = msg->msg_name;
+	const int addrlen = msg->msg_namelen;
+	access_mask_t access_request;
+	int ret = 0;
+
+	if (sk_is_udp(sock->sk))
+		access_request = LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP;
+	else
+		return 0;
+
+	if (address != NULL)
+		ret = current_check_access_socket(sock, address, addrlen,
+						  access_request, false);
+
+	if (ret == 0)
+		ret = current_check_autobind_udp_socket(sock);
+
+	return ret;
+}
+
 static struct security_hook_list landlock_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(socket_bind, hook_socket_bind),
 	LSM_HOOK_INIT(socket_connect, hook_socket_connect),
+	LSM_HOOK_INIT(socket_sendmsg, hook_socket_sendmsg),
 };
 
 __init void landlock_add_net_hooks(void)
-- 
2.39.5


* [PATCH v4 4/7] selftests/landlock: Add UDP bind/connect tests
From: Matthieu Buffet @ 2026-05-02 12:43 UTC
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Mikhail Ivanov,
	konstantin.meskhidze, Tingmao Wang, netdev, Matthieu Buffet
In-Reply-To: <20260502124306.3975990-1-matthieu@buffet.re>

Extend the existing bind() and connect() test suite to cover UDP
restrictions.

Signed-off-by: Matthieu Buffet <matthieu@buffet.re>
---
 tools/testing/selftests/landlock/net_test.c | 488 ++++++++++++++++----
 1 file changed, 401 insertions(+), 87 deletions(-)

diff --git a/tools/testing/selftests/landlock/net_test.c b/tools/testing/selftests/landlock/net_test.c
index 016c7277e370..568a6ed7139c 100644
--- a/tools/testing/selftests/landlock/net_test.c
+++ b/tools/testing/selftests/landlock/net_test.c
@@ -35,6 +35,7 @@ enum sandbox_type {
 	NO_SANDBOX,
 	/* This may be used to test rules that allow *and* deny accesses. */
 	TCP_SANDBOX,
+	UDP_SANDBOX,
 };
 
 static int set_service(struct service_fixture *const srv,
@@ -93,23 +94,53 @@ static bool prot_is_tcp(const struct protocol_variant *const prot)
 	       (prot->protocol == IPPROTO_TCP || prot->protocol == IPPROTO_IP);
 }
 
+static bool prot_is_udp(const struct protocol_variant *const prot)
+{
+	return (prot->domain == AF_INET || prot->domain == AF_INET6) &&
+	       prot->type == SOCK_DGRAM &&
+	       (prot->protocol == IPPROTO_UDP || prot->protocol == IPPROTO_IP);
+}
+
 static bool is_restricted(const struct protocol_variant *const prot,
 			  const enum sandbox_type sandbox)
 {
 	if (sandbox == TCP_SANDBOX)
 		return prot_is_tcp(prot);
+	else if (sandbox == UDP_SANDBOX)
+		return prot_is_udp(prot);
 	return false;
 }
 
 static int socket_variant(const struct service_fixture *const srv)
 {
+	/* Arbitrary value just to not block other tests indefinitely. */
+	const struct timeval timeout = {
+		.tv_sec = 0,
+		.tv_usec = 100000,
+	};
+	int sockfd;
 	int ret;
 
-	ret = socket(srv->protocol.domain, srv->protocol.type | SOCK_CLOEXEC,
-		     srv->protocol.protocol);
-	if (ret < 0)
+	sockfd = socket(srv->protocol.domain, srv->protocol.type | SOCK_CLOEXEC,
+			srv->protocol.protocol);
+	if (sockfd < 0)
 		return -errno;
-	return ret;
+
+	ret = setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &timeout,
+			 sizeof(timeout));
+	if (ret != 0) {
+		ret = -errno;
+		close(sockfd);
+		return ret;
+	}
+	ret = setsockopt(sockfd, SOL_SOCKET, SO_SNDTIMEO, &timeout,
+			 sizeof(timeout));
+	if (ret != 0) {
+		ret = -errno;
+		close(sockfd);
+		return ret;
+	}
+	return sockfd;
 }
 
 #ifndef SIN6_LEN_RFC2133
@@ -271,10 +302,9 @@ FIXTURE_VARIANT(protocol)
 
 FIXTURE_SETUP(protocol)
 {
-	const struct protocol_variant prot_unspec = {
-		.domain = AF_UNSPEC,
-		.type = SOCK_STREAM,
-	};
+	struct protocol_variant prot_unspec = variant->prot;
+
+	prot_unspec.domain = AF_UNSPEC;
 
 	disable_caps(_metadata);
 
@@ -510,6 +540,92 @@ FIXTURE_VARIANT_ADD(protocol, tcp_sandbox_with_unix_datagram) {
 	},
 };
 
+/* clang-format off */
+FIXTURE_VARIANT_ADD(protocol, udp_sandbox_with_ipv4_udp1) {
+	/* clang-format on */
+	.sandbox = UDP_SANDBOX,
+	.prot = {
+		.domain = AF_INET,
+		.type = SOCK_DGRAM,
+		.protocol = IPPROTO_UDP,
+	},
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(protocol, udp_sandbox_with_ipv4_udp2) {
+	/* clang-format on */
+	.sandbox = UDP_SANDBOX,
+	.prot = {
+		.domain = AF_INET,
+		.type = SOCK_DGRAM,
+		/* IPPROTO_IP == 0 */
+		.protocol = IPPROTO_IP,
+	},
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(protocol, udp_sandbox_with_ipv6_udp1) {
+	/* clang-format on */
+	.sandbox = UDP_SANDBOX,
+	.prot = {
+		.domain = AF_INET6,
+		.type = SOCK_DGRAM,
+		.protocol = IPPROTO_UDP,
+	},
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(protocol, udp_sandbox_with_ipv6_udp2) {
+	/* clang-format on */
+	.sandbox = UDP_SANDBOX,
+	.prot = {
+		.domain = AF_INET6,
+		.type = SOCK_DGRAM,
+		/* IPPROTO_IP == 0 */
+		.protocol = IPPROTO_IP,
+	},
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(protocol, udp_sandbox_with_ipv4_tcp) {
+	/* clang-format on */
+	.sandbox = UDP_SANDBOX,
+	.prot = {
+		.domain = AF_INET,
+		.type = SOCK_STREAM,
+	},
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(protocol, udp_sandbox_with_ipv6_tcp) {
+	/* clang-format on */
+	.sandbox = UDP_SANDBOX,
+	.prot = {
+		.domain = AF_INET6,
+		.type = SOCK_STREAM,
+	},
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(protocol, udp_sandbox_with_unix_stream) {
+	/* clang-format on */
+	.sandbox = UDP_SANDBOX,
+	.prot = {
+		.domain = AF_UNIX,
+		.type = SOCK_STREAM,
+	},
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(protocol, udp_sandbox_with_unix_datagram) {
+	/* clang-format on */
+	.sandbox = UDP_SANDBOX,
+	.prot = {
+		.domain = AF_UNIX,
+		.type = SOCK_DGRAM,
+	},
+};
+
 static void test_bind_and_connect(struct __test_metadata *const _metadata,
 				  const struct service_fixture *const srv,
 				  const bool deny_bind, const bool deny_connect)
@@ -602,7 +718,7 @@ static void test_bind_and_connect(struct __test_metadata *const _metadata,
 		ret = connect_variant(connect_fd, srv);
 		if (deny_connect) {
 			EXPECT_EQ(-EACCES, ret);
-		} else if (deny_bind) {
+		} else if (deny_bind && srv->protocol.type == SOCK_STREAM) {
 			/* No listening server. */
 			EXPECT_EQ(-ECONNREFUSED, ret);
 		} else {
@@ -641,18 +757,25 @@ static void test_bind_and_connect(struct __test_metadata *const _metadata,
 
 TEST_F(protocol, bind)
 {
-	if (variant->sandbox == TCP_SANDBOX) {
+	if (variant->sandbox == TCP_SANDBOX ||
+	    variant->sandbox == UDP_SANDBOX) {
+		const __u64 bind_access =
+			(variant->sandbox == TCP_SANDBOX ?
+				 LANDLOCK_ACCESS_NET_BIND_TCP :
+				 LANDLOCK_ACCESS_NET_BIND_UDP);
+		const __u64 conn_access =
+			(variant->sandbox == TCP_SANDBOX ?
+				 LANDLOCK_ACCESS_NET_CONNECT_TCP :
+				 LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP);
 		const struct landlock_ruleset_attr ruleset_attr = {
-			.handled_access_net = LANDLOCK_ACCESS_NET_BIND_TCP |
-					      LANDLOCK_ACCESS_NET_CONNECT_TCP,
+			.handled_access_net = bind_access | conn_access,
 		};
-		const struct landlock_net_port_attr tcp_bind_connect_p0 = {
-			.allowed_access = LANDLOCK_ACCESS_NET_BIND_TCP |
-					  LANDLOCK_ACCESS_NET_CONNECT_TCP,
+		const struct landlock_net_port_attr bind_connect_p0 = {
+			.allowed_access = bind_access | conn_access,
 			.port = self->srv0.port,
 		};
-		const struct landlock_net_port_attr tcp_connect_p1 = {
-			.allowed_access = LANDLOCK_ACCESS_NET_CONNECT_TCP,
+		const struct landlock_net_port_attr connect_p1 = {
+			.allowed_access = conn_access,
 			.port = self->srv1.port,
 		};
 		int ruleset_fd;
@@ -664,12 +787,26 @@ TEST_F(protocol, bind)
 		/* Allows connect and bind for the first port.  */
 		ASSERT_EQ(0,
 			  landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
-					    &tcp_bind_connect_p0, 0));
+					    &bind_connect_p0, 0));
 
 		/* Allows connect and denies bind for the second port. */
 		ASSERT_EQ(0,
 			  landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
-					    &tcp_connect_p1, 0));
+					    &connect_p1, 0));
+
+		/*
+		 * For UDP sockets, allows binding to ephemeral ports
+		 * (required to connect or send a first datagram)
+		 */
+		if (variant->sandbox == UDP_SANDBOX) {
+			const struct landlock_net_port_attr bind_ephemeral = {
+				.allowed_access = bind_access,
+				.port = 0,
+			};
+			ASSERT_EQ(0, landlock_add_rule(ruleset_fd,
+						       LANDLOCK_RULE_NET_PORT,
+						       &bind_ephemeral, 0));
+		}
 
 		enforce_ruleset(_metadata, ruleset_fd);
 		EXPECT_EQ(0, close(ruleset_fd));
@@ -691,18 +828,25 @@ TEST_F(protocol, bind)
 
 TEST_F(protocol, connect)
 {
-	if (variant->sandbox == TCP_SANDBOX) {
+	if (variant->sandbox == TCP_SANDBOX ||
+	    variant->sandbox == UDP_SANDBOX) {
+		const __u64 bind_access =
+			(variant->sandbox == TCP_SANDBOX ?
+				 LANDLOCK_ACCESS_NET_BIND_TCP :
+				 LANDLOCK_ACCESS_NET_BIND_UDP);
+		const __u64 conn_access =
+			(variant->sandbox == TCP_SANDBOX ?
+				 LANDLOCK_ACCESS_NET_CONNECT_TCP :
+				 LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP);
 		const struct landlock_ruleset_attr ruleset_attr = {
-			.handled_access_net = LANDLOCK_ACCESS_NET_BIND_TCP |
-					      LANDLOCK_ACCESS_NET_CONNECT_TCP,
+			.handled_access_net = bind_access | conn_access,
 		};
-		const struct landlock_net_port_attr tcp_bind_connect_p0 = {
-			.allowed_access = LANDLOCK_ACCESS_NET_BIND_TCP |
-					  LANDLOCK_ACCESS_NET_CONNECT_TCP,
+		const struct landlock_net_port_attr bind_connect_p0 = {
+			.allowed_access = bind_access | conn_access,
 			.port = self->srv0.port,
 		};
-		const struct landlock_net_port_attr tcp_bind_p1 = {
-			.allowed_access = LANDLOCK_ACCESS_NET_BIND_TCP,
+		const struct landlock_net_port_attr bind_p1 = {
+			.allowed_access = bind_access,
 			.port = self->srv1.port,
 		};
 		int ruleset_fd;
@@ -714,12 +858,26 @@ TEST_F(protocol, connect)
 		/* Allows connect and bind for the first port. */
 		ASSERT_EQ(0,
 			  landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
-					    &tcp_bind_connect_p0, 0));
+					    &bind_connect_p0, 0));
 
 		/* Allows bind and denies connect for the second port. */
 		ASSERT_EQ(0,
 			  landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
-					    &tcp_bind_p1, 0));
+					    &bind_p1, 0));
+
+		/*
+		 * For UDP sockets, allows binding to ephemeral ports
+		 * (required to connect or send a first datagram)
+		 */
+		if (variant->sandbox == UDP_SANDBOX) {
+			const struct landlock_net_port_attr bind_ephemeral = {
+				.allowed_access = bind_access,
+				.port = 0,
+			};
+			ASSERT_EQ(0, landlock_add_rule(ruleset_fd,
+						       LANDLOCK_RULE_NET_PORT,
+						       &bind_ephemeral, 0));
+		}
 
 		enforce_ruleset(_metadata, ruleset_fd);
 		EXPECT_EQ(0, close(ruleset_fd));
@@ -737,16 +895,20 @@ TEST_F(protocol, connect)
 
 TEST_F(protocol, bind_unspec)
 {
+	const int bind_access = (variant->sandbox == TCP_SANDBOX ?
+					 LANDLOCK_ACCESS_NET_BIND_TCP :
+					 LANDLOCK_ACCESS_NET_BIND_UDP);
 	const struct landlock_ruleset_attr ruleset_attr = {
-		.handled_access_net = LANDLOCK_ACCESS_NET_BIND_TCP,
+		.handled_access_net = bind_access,
 	};
-	const struct landlock_net_port_attr tcp_bind = {
-		.allowed_access = LANDLOCK_ACCESS_NET_BIND_TCP,
+	const struct landlock_net_port_attr rule_bind = {
+		.allowed_access = bind_access,
 		.port = self->srv0.port,
 	};
 	int bind_fd, ret;
 
-	if (variant->sandbox == TCP_SANDBOX) {
+	if (variant->sandbox == TCP_SANDBOX ||
+	    variant->sandbox == UDP_SANDBOX) {
 		const int ruleset_fd = landlock_create_ruleset(
 			&ruleset_attr, sizeof(ruleset_attr), 0);
 		ASSERT_LE(0, ruleset_fd);
@@ -754,7 +916,7 @@ TEST_F(protocol, bind_unspec)
 		/* Allows bind. */
 		ASSERT_EQ(0,
 			  landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
-					    &tcp_bind, 0));
+					    &rule_bind, 0));
 		enforce_ruleset(_metadata, ruleset_fd);
 		EXPECT_EQ(0, close(ruleset_fd));
 	}
@@ -782,7 +944,8 @@ TEST_F(protocol, bind_unspec)
 	}
 	EXPECT_EQ(0, close(bind_fd));
 
-	if (variant->sandbox == TCP_SANDBOX) {
+	if (variant->sandbox == TCP_SANDBOX ||
+	    variant->sandbox == UDP_SANDBOX) {
 		const int ruleset_fd = landlock_create_ruleset(
 			&ruleset_attr, sizeof(ruleset_attr), 0);
 		ASSERT_LE(0, ruleset_fd);
@@ -828,11 +991,15 @@ TEST_F(protocol, bind_unspec)
 
 TEST_F(protocol, connect_unspec)
 {
+	const __u64 connect_right =
+		(variant->sandbox == TCP_SANDBOX ?
+			 LANDLOCK_ACCESS_NET_CONNECT_TCP :
+			 LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP);
 	const struct landlock_ruleset_attr ruleset_attr = {
-		.handled_access_net = LANDLOCK_ACCESS_NET_CONNECT_TCP,
+		.handled_access_net = connect_right,
 	};
-	const struct landlock_net_port_attr tcp_connect = {
-		.allowed_access = LANDLOCK_ACCESS_NET_CONNECT_TCP,
+	const struct landlock_net_port_attr rule_connect = {
+		.allowed_access = connect_right,
 		.port = self->srv0.port,
 	};
 	int bind_fd, client_fd, status;
@@ -865,7 +1032,8 @@ TEST_F(protocol, connect_unspec)
 			EXPECT_EQ(0, ret);
 		}
 
-		if (variant->sandbox == TCP_SANDBOX) {
+		if (variant->sandbox == TCP_SANDBOX ||
+		    variant->sandbox == UDP_SANDBOX) {
 			const int ruleset_fd = landlock_create_ruleset(
 				&ruleset_attr, sizeof(ruleset_attr), 0);
 			ASSERT_LE(0, ruleset_fd);
@@ -873,7 +1041,7 @@ TEST_F(protocol, connect_unspec)
 			/* Allows connect. */
 			ASSERT_EQ(0, landlock_add_rule(ruleset_fd,
 						       LANDLOCK_RULE_NET_PORT,
-						       &tcp_connect, 0));
+						       &rule_connect, 0));
 			enforce_ruleset(_metadata, ruleset_fd);
 			EXPECT_EQ(0, close(ruleset_fd));
 		}
@@ -896,7 +1064,8 @@ TEST_F(protocol, connect_unspec)
 			EXPECT_EQ(0, ret);
 		}
 
-		if (variant->sandbox == TCP_SANDBOX) {
+		if (variant->sandbox == TCP_SANDBOX ||
+		    variant->sandbox == UDP_SANDBOX) {
 			const int ruleset_fd = landlock_create_ruleset(
 				&ruleset_attr, sizeof(ruleset_attr), 0);
 			ASSERT_LE(0, ruleset_fd);
@@ -975,6 +1144,13 @@ FIXTURE_VARIANT_ADD(ipv4, tcp_sandbox_with_tcp) {
 	.type = SOCK_STREAM,
 };
 
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ipv4, udp_sandbox_with_tcp) {
+	/* clang-format on */
+	.sandbox = UDP_SANDBOX,
+	.type = SOCK_STREAM,
+};
+
 /* clang-format off */
 FIXTURE_VARIANT_ADD(ipv4, no_sandbox_with_udp) {
 	/* clang-format on */
@@ -989,6 +1165,13 @@ FIXTURE_VARIANT_ADD(ipv4, tcp_sandbox_with_udp) {
 	.type = SOCK_DGRAM,
 };
 
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ipv4, udp_sandbox_with_udp) {
+	/* clang-format on */
+	.sandbox = UDP_SANDBOX,
+	.type = SOCK_DGRAM,
+};
+
 FIXTURE_SETUP(ipv4)
 {
 	const struct protocol_variant prot = {
@@ -1012,14 +1195,19 @@ TEST_F(ipv4, from_unix_to_inet)
 {
 	int unix_stream_fd, unix_dgram_fd;
 
-	if (variant->sandbox == TCP_SANDBOX) {
+	if (variant->sandbox == TCP_SANDBOX ||
+	    variant->sandbox == UDP_SANDBOX) {
+		const int access_rights =
+			(variant->sandbox == TCP_SANDBOX ?
+				 LANDLOCK_ACCESS_NET_BIND_TCP |
+					 LANDLOCK_ACCESS_NET_CONNECT_TCP :
+				 LANDLOCK_ACCESS_NET_BIND_UDP |
+					 LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP);
 		const struct landlock_ruleset_attr ruleset_attr = {
-			.handled_access_net = LANDLOCK_ACCESS_NET_BIND_TCP |
-					      LANDLOCK_ACCESS_NET_CONNECT_TCP,
+			.handled_access_net = access_rights,
 		};
 		const struct landlock_net_port_attr tcp_bind_connect_p0 = {
-			.allowed_access = LANDLOCK_ACCESS_NET_BIND_TCP |
-					  LANDLOCK_ACCESS_NET_CONNECT_TCP,
+			.allowed_access = access_rights,
 			.port = self->srv0.port,
 		};
 		int ruleset_fd;
@@ -1680,6 +1868,7 @@ TEST_F(ipv4_tcp, with_fs)
 FIXTURE(port_specific)
 {
 	struct service_fixture srv0;
+	struct service_fixture cli1;
 };
 
 FIXTURE_VARIANT(port_specific)
@@ -1699,7 +1888,7 @@ FIXTURE_VARIANT_ADD(port_specific, no_sandbox_with_ipv4) {
 };
 
 /* clang-format off */
-FIXTURE_VARIANT_ADD(port_specific, sandbox_with_ipv4) {
+FIXTURE_VARIANT_ADD(port_specific, tcp_sandbox_with_ipv4) {
 	/* clang-format on */
 	.sandbox = TCP_SANDBOX,
 	.prot = {
@@ -1708,6 +1897,16 @@ FIXTURE_VARIANT_ADD(port_specific, sandbox_with_ipv4) {
 	},
 };
 
+/* clang-format off */
+FIXTURE_VARIANT_ADD(port_specific, udp_sandbox_with_ipv4) {
+	/* clang-format on */
+	.sandbox = UDP_SANDBOX,
+	.prot = {
+		.domain = AF_INET,
+		.type = SOCK_DGRAM,
+	},
+};
+
 /* clang-format off */
 FIXTURE_VARIANT_ADD(port_specific, no_sandbox_with_ipv6) {
 	/* clang-format on */
@@ -1719,7 +1918,7 @@ FIXTURE_VARIANT_ADD(port_specific, no_sandbox_with_ipv6) {
 };
 
 /* clang-format off */
-FIXTURE_VARIANT_ADD(port_specific, sandbox_with_ipv6) {
+FIXTURE_VARIANT_ADD(port_specific, tcp_sandbox_with_ipv6) {
 	/* clang-format on */
 	.sandbox = TCP_SANDBOX,
 	.prot = {
@@ -1728,11 +1927,22 @@ FIXTURE_VARIANT_ADD(port_specific, sandbox_with_ipv6) {
 	},
 };
 
+/* clang-format off */
+FIXTURE_VARIANT_ADD(port_specific, udp_sandbox_with_ipv6) {
+	/* clang-format on */
+	.sandbox = UDP_SANDBOX,
+	.prot = {
+		.domain = AF_INET6,
+		.type = SOCK_DGRAM,
+	},
+};
+
 FIXTURE_SETUP(port_specific)
 {
 	disable_caps(_metadata);
 
 	ASSERT_EQ(0, set_service(&self->srv0, variant->prot, 0));
+	ASSERT_EQ(0, set_service(&self->cli1, variant->prot, 1));
 
 	setup_loopback(_metadata);
 };
@@ -1747,14 +1957,19 @@ TEST_F(port_specific, bind_connect_zero)
 	uint16_t port;
 
 	/* Adds a rule layer with bind and connect actions. */
-	if (variant->sandbox == TCP_SANDBOX) {
+	if (variant->sandbox == TCP_SANDBOX ||
+	    variant->sandbox == UDP_SANDBOX) {
+		const int access_rights =
+			(variant->sandbox == TCP_SANDBOX ?
+				 LANDLOCK_ACCESS_NET_BIND_TCP |
+					 LANDLOCK_ACCESS_NET_CONNECT_TCP :
+				 LANDLOCK_ACCESS_NET_BIND_UDP |
+					 LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP);
 		const struct landlock_ruleset_attr ruleset_attr = {
-			.handled_access_net = LANDLOCK_ACCESS_NET_BIND_TCP |
-					      LANDLOCK_ACCESS_NET_CONNECT_TCP
+			.handled_access_net = access_rights,
 		};
-		const struct landlock_net_port_attr tcp_bind_connect_zero = {
-			.allowed_access = LANDLOCK_ACCESS_NET_BIND_TCP |
-					  LANDLOCK_ACCESS_NET_CONNECT_TCP,
+		const struct landlock_net_port_attr bind_connect_zero = {
+			.allowed_access = access_rights,
 			.port = 0,
 		};
 		int ruleset_fd;
@@ -1766,7 +1981,7 @@ TEST_F(port_specific, bind_connect_zero)
 		/* Checks zero port value on bind and connect actions. */
 		EXPECT_EQ(0,
 			  landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
-					    &tcp_bind_connect_zero, 0));
+					    &bind_connect_zero, 0));
 
 		enforce_ruleset(_metadata, ruleset_fd);
 		EXPECT_EQ(0, close(ruleset_fd));
@@ -1787,11 +2002,16 @@ TEST_F(port_specific, bind_connect_zero)
 	ret = bind_variant(bind_fd, &self->srv0);
 	EXPECT_EQ(0, ret);
 
-	EXPECT_EQ(0, listen(bind_fd, backlog));
+	if (variant->prot.type == SOCK_STREAM)
+		EXPECT_EQ(0, listen(bind_fd, backlog));
 
 	/* Connects on port 0. */
 	ret = connect_variant(connect_fd, &self->srv0);
-	EXPECT_EQ(-ECONNREFUSED, ret);
+	if (variant->prot.type == SOCK_STREAM) {
+		EXPECT_EQ(-ECONNREFUSED, ret);
+	} else {
+		EXPECT_EQ(0, ret);
+	}
 
 	/* Sets binded port for both protocol families. */
 	port = get_binded_port(bind_fd, &variant->prot);
@@ -1815,23 +2035,35 @@ TEST_F(port_specific, bind_connect_1023)
 	int bind_fd, connect_fd, ret;
 
 	/* Adds a rule layer with bind and connect actions. */
-	if (variant->sandbox == TCP_SANDBOX) {
+	if (variant->sandbox == TCP_SANDBOX ||
+	    variant->sandbox == UDP_SANDBOX) {
+		const int bind_right = (variant->sandbox == TCP_SANDBOX ?
+						LANDLOCK_ACCESS_NET_BIND_TCP :
+						LANDLOCK_ACCESS_NET_BIND_UDP);
+		const int access_rights =
+			(variant->sandbox == TCP_SANDBOX ?
+				 (LANDLOCK_ACCESS_NET_BIND_TCP |
+				  LANDLOCK_ACCESS_NET_CONNECT_TCP) :
+				 (LANDLOCK_ACCESS_NET_BIND_UDP |
+				  LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP));
 		const struct landlock_ruleset_attr ruleset_attr = {
-			.handled_access_net = LANDLOCK_ACCESS_NET_BIND_TCP |
-					      LANDLOCK_ACCESS_NET_CONNECT_TCP
+			.handled_access_net = access_rights,
 		};
 		/* A rule with port value less than 1024. */
-		const struct landlock_net_port_attr tcp_bind_connect_low_range = {
-			.allowed_access = LANDLOCK_ACCESS_NET_BIND_TCP |
-					  LANDLOCK_ACCESS_NET_CONNECT_TCP,
+		const struct landlock_net_port_attr bind_connect_low_range = {
+			.allowed_access = access_rights,
 			.port = 1023,
 		};
 		/* A rule with 1024 port. */
-		const struct landlock_net_port_attr tcp_bind_connect = {
-			.allowed_access = LANDLOCK_ACCESS_NET_BIND_TCP |
-					  LANDLOCK_ACCESS_NET_CONNECT_TCP,
+		const struct landlock_net_port_attr bind_connect = {
+			.allowed_access = access_rights,
 			.port = 1024,
 		};
+		/* A rule with cli1's port, to use as source port. */
+		const struct landlock_net_port_attr srcport = {
+			.allowed_access = bind_right,
+			.port = self->cli1.port,
+		};
 		int ruleset_fd;
 
 		ruleset_fd = landlock_create_ruleset(&ruleset_attr,
@@ -1840,10 +2072,15 @@ TEST_F(port_specific, bind_connect_1023)
 
 		ASSERT_EQ(0,
 			  landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
-					    &tcp_bind_connect_low_range, 0));
+					    &bind_connect_low_range, 0));
 		ASSERT_EQ(0,
 			  landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
-					    &tcp_bind_connect, 0));
+					    &bind_connect, 0));
+		if (variant->sandbox == UDP_SANDBOX) {
+			ASSERT_EQ(0, landlock_add_rule(ruleset_fd,
+						       LANDLOCK_RULE_NET_PORT,
+						       &srcport, 0));
+		}
 
 		enforce_ruleset(_metadata, ruleset_fd);
 		EXPECT_EQ(0, close(ruleset_fd));
@@ -1867,8 +2104,19 @@ TEST_F(port_specific, bind_connect_1023)
 	ret = bind_variant(bind_fd, &self->srv0);
 	clear_cap(_metadata, CAP_NET_BIND_SERVICE);
 	EXPECT_EQ(0, ret);
-	EXPECT_EQ(0, listen(bind_fd, backlog));
+	if (variant->prot.type == SOCK_STREAM)
+		EXPECT_EQ(0, listen(bind_fd, backlog));
 
+	connect_fd = socket_variant(&self->srv0);
+	ASSERT_LE(0, connect_fd);
+	if (variant->prot.type == SOCK_DGRAM) {
+		/*
+		 * We are about to connect(), but bind() is restricted, so for
+		 * UDP sockets we need to use cli1's port as source port (the
+		 * only one we are allowed to use).
+		 */
+		EXPECT_EQ(0, bind_variant(connect_fd, &self->cli1));
+	}
 	/* Connects on the binded port 1023. */
 	ret = connect_variant(connect_fd, &self->srv0);
 	EXPECT_EQ(0, ret);
@@ -1887,7 +2135,10 @@ TEST_F(port_specific, bind_connect_1023)
 	/* Binds on port 1024. */
 	ret = bind_variant(bind_fd, &self->srv0);
 	EXPECT_EQ(0, ret);
-	EXPECT_EQ(0, listen(bind_fd, backlog));
+	if (variant->prot.type == SOCK_STREAM)
+		EXPECT_EQ(0, listen(bind_fd, backlog));
+	if (variant->prot.type == SOCK_DGRAM)
+		EXPECT_EQ(0, bind_variant(connect_fd, &self->cli1));
 
 	/* Connects on the binded port 1024. */
 	ret = connect_variant(connect_fd, &self->srv0);
@@ -1897,23 +2148,30 @@ TEST_F(port_specific, bind_connect_1023)
 	EXPECT_EQ(0, close(bind_fd));
 }
 
-static int matches_log_tcp(const int audit_fd, const char *const blockers,
-			   const char *const dir_addr, const char *const addr,
-			   const char *const dir_port)
+static int matches_auditlog(const int audit_fd, const char *const blockers,
+			    const char *const dir_addr, const char *const addr,
+			    const char *const dir_port)
 {
-	static const char log_template[] = REGEX_LANDLOCK_PREFIX
+	static const char log_with_addrport_tmpl[] = REGEX_LANDLOCK_PREFIX
 		" blockers=%s %s=%s %s=1024$";
+	static const char log_without_addrport_tmpl[] = REGEX_LANDLOCK_PREFIX
+		" blockers=%s";
 	/*
 	 * Max strlen(blockers): 16
 	 * Max strlen(dir_addr): 5
 	 * Max strlen(addr): 12
 	 * Max strlen(dir_port): 4
 	 */
-	char log_match[sizeof(log_template) + 37];
+	char log_match[sizeof(log_with_addrport_tmpl) + 37];
 	int log_match_len;
 
-	log_match_len = snprintf(log_match, sizeof(log_match), log_template,
-				 blockers, dir_addr, addr, dir_port);
+	if (addr == NULL)
+		log_match_len = snprintf(log_match, sizeof(log_match),
+					 log_without_addrport_tmpl, blockers);
+	else
+		log_match_len = snprintf(log_match, sizeof(log_match),
+					 log_with_addrport_tmpl, blockers,
+					 dir_addr, addr, dir_port);
 	if (log_match_len > sizeof(log_match))
 		return -E2BIG;
 
@@ -1924,6 +2182,7 @@ static int matches_log_tcp(const int audit_fd, const char *const blockers,
 FIXTURE(audit)
 {
 	struct service_fixture srv0;
+	struct service_fixture srv1;
 	struct audit_filter audit_filter;
 	int audit_fd;
 };
@@ -1935,7 +2194,7 @@ FIXTURE_VARIANT(audit)
 };
 
 /* clang-format off */
-FIXTURE_VARIANT_ADD(audit, ipv4) {
+FIXTURE_VARIANT_ADD(audit, ipv4_tcp) {
 	/* clang-format on */
 	.addr = "127\\.0\\.0\\.1",
 	.prot = {
@@ -1945,7 +2204,17 @@ FIXTURE_VARIANT_ADD(audit, ipv4) {
 };
 
 /* clang-format off */
-FIXTURE_VARIANT_ADD(audit, ipv6) {
+FIXTURE_VARIANT_ADD(audit, ipv4_udp) {
+	/* clang-format on */
+	.addr = "127\\.0\\.0\\.1",
+	.prot = {
+		.domain = AF_INET,
+		.type = SOCK_DGRAM,
+	},
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(audit, ipv6_tcp) {
 	/* clang-format on */
 	.addr = "::1",
 	.prot = {
@@ -1954,9 +2223,21 @@ FIXTURE_VARIANT_ADD(audit, ipv6) {
 	},
 };
 
+/* clang-format off */
+FIXTURE_VARIANT_ADD(audit, ipv6_udp) {
+	/* clang-format on */
+	.addr = "::1",
+	.prot = {
+		.domain = AF_INET6,
+		.type = SOCK_DGRAM,
+	},
+};
+
 FIXTURE_SETUP(audit)
 {
 	ASSERT_EQ(0, set_service(&self->srv0, variant->prot, 0));
+	ASSERT_EQ(0, set_service(&self->srv1, variant->prot, 1));
+
 	setup_loopback(_metadata);
 
 	set_cap(_metadata, CAP_AUDIT_CONTROL);
@@ -1974,9 +2255,17 @@ FIXTURE_TEARDOWN(audit)
 
 TEST_F(audit, bind)
 {
+	const char *audit_evt = (variant->prot.type == SOCK_STREAM ?
+					 "net\\.bind_tcp" :
+					 "net\\.bind_udp");
+	const int access_rights =
+		(variant->prot.type == SOCK_STREAM ?
+			 LANDLOCK_ACCESS_NET_BIND_TCP |
+				 LANDLOCK_ACCESS_NET_CONNECT_TCP :
+			 LANDLOCK_ACCESS_NET_BIND_UDP |
+				 LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP);
 	const struct landlock_ruleset_attr ruleset_attr = {
-		.handled_access_net = LANDLOCK_ACCESS_NET_BIND_TCP |
-				      LANDLOCK_ACCESS_NET_CONNECT_TCP,
+		.handled_access_net = access_rights,
 	};
 	struct audit_records records;
 	int ruleset_fd, sock_fd;
@@ -1990,8 +2279,8 @@ TEST_F(audit, bind)
 	sock_fd = socket_variant(&self->srv0);
 	ASSERT_LE(0, sock_fd);
 	EXPECT_EQ(-EACCES, bind_variant(sock_fd, &self->srv0));
-	EXPECT_EQ(0, matches_log_tcp(self->audit_fd, "net\\.bind_tcp", "saddr",
-				     variant->addr, "src"));
+	EXPECT_EQ(0, matches_auditlog(self->audit_fd, audit_evt, "saddr",
+				      variant->addr, "src"));
 
 	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
 	EXPECT_EQ(0, records.access);
@@ -2002,9 +2291,22 @@ TEST_F(audit, bind)
 
 TEST_F(audit, connect)
 {
+	const char *audit_evt = (variant->prot.type == SOCK_STREAM ?
+					 "net\\.connect_tcp" :
+					 "net\\.connect_send_udp");
+	const int bind_right = (variant->prot.type == SOCK_STREAM ?
+					LANDLOCK_ACCESS_NET_BIND_TCP :
+					LANDLOCK_ACCESS_NET_BIND_UDP);
+	const int conn_right = (variant->prot.type == SOCK_STREAM ?
+					LANDLOCK_ACCESS_NET_CONNECT_TCP :
+					LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP);
+	const int access_rights = bind_right | conn_right;
 	const struct landlock_ruleset_attr ruleset_attr = {
-		.handled_access_net = LANDLOCK_ACCESS_NET_BIND_TCP |
-				      LANDLOCK_ACCESS_NET_CONNECT_TCP,
+		.handled_access_net = access_rights,
+	};
+	const struct landlock_net_port_attr rule_connect_p1 = {
+		.allowed_access = conn_right,
+		.port = self->srv1.port,
 	};
 	struct audit_records records;
 	int ruleset_fd, sock_fd;
@@ -2012,19 +2314,31 @@ TEST_F(audit, connect)
 	ruleset_fd =
 		landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
 	ASSERT_LE(0, ruleset_fd);
+	ASSERT_EQ(0, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
+				       &rule_connect_p1, 0));
 	enforce_ruleset(_metadata, ruleset_fd);
 	EXPECT_EQ(0, close(ruleset_fd));
 
 	sock_fd = socket_variant(&self->srv0);
 	ASSERT_LE(0, sock_fd);
 	EXPECT_EQ(-EACCES, connect_variant(sock_fd, &self->srv0));
-	EXPECT_EQ(0, matches_log_tcp(self->audit_fd, "net\\.connect_tcp",
-				     "daddr", variant->addr, "dest"));
+	EXPECT_EQ(0, matches_auditlog(self->audit_fd, audit_evt, "daddr",
+				      variant->addr, "dest"));
 
 	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
 	EXPECT_EQ(0, records.access);
 	EXPECT_EQ(1, records.domain);
 
+	if (variant->prot.type == SOCK_DGRAM) {
+		/* Check that autobind generates a denied bind event. */
+		EXPECT_EQ(-EACCES, connect_variant(sock_fd, &self->srv1));
+
+		EXPECT_EQ(0, matches_auditlog(self->audit_fd, "net\\.bind_udp",
+					      NULL, NULL, NULL));
+		EXPECT_EQ(0, records.access);
+		EXPECT_EQ(1, records.domain);
+	}
+
 	EXPECT_EQ(0, close(sock_fd));
 }
 
-- 
2.39.5
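
The two-template logic in matches_auditlog() above can be restated as
a standalone helper. This is a minimal sketch under stated assumptions,
not the selftest's code: build_audit_regex() is a hypothetical name, and
the "prefix" parameter stands in for REGEX_LANDLOCK_PREFIX, whose value
is not shown here. The truncation check relies on snprintf() returning
the length the full string would have had.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes the audit regex into buf and returns its
 * length, or -1 if it does not fit. "prefix" is an assumed stand-in for
 * REGEX_LANDLOCK_PREFIX.
 */
static int build_audit_regex(char *buf, size_t size, const char *prefix,
			     const char *blockers, const char *dir_addr,
			     const char *addr, const char *dir_port)
{
	int len;

	assert(buf != NULL && prefix != NULL && blockers != NULL);

	if (addr == NULL)
		/* Autobind denials carry no address/port fields. */
		len = snprintf(buf, size, "%s blockers=%s", prefix, blockers);
	else
		len = snprintf(buf, size, "%s blockers=%s %s=%s %s=1024$",
			       prefix, blockers, dir_addr, addr, dir_port);

	/* A return value >= size means the output was truncated. */
	if (len < 0 || (size_t)len >= size)
		return -1;

	return len;
}
```

Passing a NULL addr selects the short form, which is how the autobind
checks above match a "net\.bind_udp" record without an address or port.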


* [PATCH v4 5/7] selftests/landlock: Add tests for sendmsg()
From: Matthieu Buffet @ 2026-05-02 12:43 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Mikhail Ivanov,
	konstantin.meskhidze, Tingmao Wang, netdev, Matthieu Buffet
In-Reply-To: <20260502124306.3975990-1-matthieu@buffet.re>

Add sendmsg() tests to the protocol_* variants (sendmsg_stream,
sendmsg_dgram and sendmsg_unspec) to ensure behaviour is consistent
across AF_INET, AF_INET6 and AF_UNIX, plus an audit test covering denied
sends and denied autobinds.

Signed-off-by: Matthieu Buffet <matthieu@buffet.re>
---
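The result that test_sendmsg() expects from its second sendto() call
(with or without an explicit destination) can be restated as a small
decision function. This is only a sketch of the expectations encoded in
the test below, not part of net_test.c: struct send_case and
expected_sendto_result() are hypothetical names introduced for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical description of one sendto() scenario. */
struct send_case {
	int sock_type;		/* SOCK_STREAM or SOCK_DGRAM */
	int family;		/* AF_INET, AF_INET6 or AF_UNIX */
	bool has_dst;		/* explicit destination passed to sendto() */
	bool has_peer;		/* socket already has a remote peer set */
	bool needs_autobind;	/* IP socket without a local port yet */
	bool bind_denied;	/* binding to port 0 is denied */
	bool send_denied;	/* destination port is not allowed */
};

/* Returns 0 when the send should succeed, else the negative errno. */
static int expected_sendto_result(const struct send_case *c)
{
	assert(c != NULL);

	if (c->sock_type == SOCK_STREAM && !c->has_peer)
		return -EPIPE;

	if ((c->send_denied && c->has_dst) ||
	    (c->bind_denied && c->needs_autobind))
		return -EACCES;

	if (!c->has_dst && !c->has_peer)
		return c->family == AF_UNIX ? -ENOTCONN : -EDESTADDRREQ;

	return 0;
}
```

For instance, an unbound UDP client sending to an allowed port while
bind(0) is denied is expected to fail with -EACCES.
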
 tools/testing/selftests/landlock/net_test.c | 652 +++++++++++++++++++-
 1 file changed, 651 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/landlock/net_test.c b/tools/testing/selftests/landlock/net_test.c
index 568a6ed7139c..2c72fda3c606 100644
--- a/tools/testing/selftests/landlock/net_test.c
+++ b/tools/testing/selftests/landlock/net_test.c
@@ -289,9 +289,163 @@ static int connect_variant(const int sock_fd,
 	return connect_variant_addrlen(sock_fd, srv, get_addrlen(srv, false));
 }
 
+static int sendto_variant_addrlen(const int sock_fd,
+				  const struct service_fixture *const srv,
+				  const socklen_t addrlen, void *buf,
+				  size_t len, int flags)
+{
+	const struct sockaddr *dst = NULL;
+	ssize_t ret;
+
+	/*
+	 * We never want our processes to be killed by SIGPIPE: we check
+	 * return codes and errno, so that we have actual error messages.
+	 */
+	flags |= MSG_NOSIGNAL;
+
+	if (srv != NULL) {
+		switch (srv->protocol.domain) {
+		case AF_UNSPEC:
+		case AF_INET:
+			dst = (const struct sockaddr *)&srv->ipv4_addr;
+			break;
+
+		case AF_INET6:
+			dst = (const struct sockaddr *)&srv->ipv6_addr;
+			break;
+
+		case AF_UNIX:
+			dst = (const struct sockaddr *)&srv->unix_addr;
+			break;
+
+		default:
+			errno = EAFNOSUPPORT;
+			return -errno;
+		}
+	}
+
+	ret = sendto(sock_fd, buf, len, flags, dst, addrlen);
+	if (ret < 0)
+		return -errno;
+
+	/* errno is not set in cases of partial writes. */
+	if (ret != len)
+		return -EINTR;
+
+	return 0;
+}
+
+static int sendto_variant(const int sock_fd,
+			  const struct service_fixture *const srv, void *buf,
+			  size_t len, int flags)
+{
+	socklen_t addrlen = 0;
+
+	if (srv != NULL)
+		addrlen = get_addrlen(srv, false);
+
+	return sendto_variant_addrlen(sock_fd, srv, addrlen, buf, len, flags);
+}
+
+static int test_sendmsg(struct __test_metadata *const _metadata,
+			const struct protocol_variant *prot, int client_fd,
+			int server_fd, const struct service_fixture *srv,
+			bool bind_denied, bool send_denied)
+{
+	int ret;
+	socklen_t opt_len;
+	int sock_type;
+	int addr_family;
+	struct sockaddr_storage peer_addr = { 0 };
+	bool has_remote_port;
+	bool needs_autobind;
+	char read_buf[1] = { 0 };
+
+	/*
+	 * Prepare the test by inspecting the socket type and whether it
+	 * has a local/remote address set (all of which determine the
+	 * expected outcomes).
+	 */
+	opt_len = sizeof(sock_type);
+	ASSERT_EQ(0, getsockopt(client_fd, SOL_SOCKET, SO_TYPE, &sock_type,
+				&opt_len));
+	opt_len = sizeof(addr_family);
+	ASSERT_EQ(0, getsockopt(client_fd, SOL_SOCKET, SO_DOMAIN, &addr_family,
+				&opt_len));
+	opt_len = sizeof(peer_addr);
+	has_remote_port = (getpeername(client_fd, (struct sockaddr *)&peer_addr,
+				       &opt_len) == 0);
+	needs_autobind = (addr_family == AF_INET || addr_family == AF_INET6) &&
+			 get_binded_port(client_fd, prot) == 0;
+
+	/* First, check error code with truncated explicit address. */
+	if (srv != NULL) {
+		ret = sendto_variant_addrlen(
+			client_fd, srv, get_addrlen(srv, true) - 1, "A", 1, 0);
+		if (sock_type == SOCK_STREAM && !has_remote_port) {
+			EXPECT_EQ(-EPIPE, ret)
+			{
+				return -1;
+			}
+		} else if (bind_denied && needs_autobind) {
+			EXPECT_EQ(-EACCES, ret)
+			{
+				return -1;
+			}
+		} else {
+			EXPECT_EQ(-EINVAL, ret)
+			{
+				return -1;
+			}
+		}
+	}
+
+	/* With or without explicit destination address (srv can be NULL). */
+	ret = sendto_variant(client_fd, srv, "B", 1, 0);
+	if (sock_type == SOCK_STREAM && !has_remote_port) {
+		EXPECT_EQ(-EPIPE, ret)
+		{
+			return -1;
+		}
+	} else if ((send_denied && srv != NULL) ||
+		   (bind_denied && needs_autobind)) {
+		ASSERT_EQ(-EACCES, ret)
+		{
+			return -1;
+		}
+	} else if (srv == NULL && !has_remote_port) {
+		if (addr_family == AF_UNIX) {
+			ASSERT_EQ(-ENOTCONN, ret)
+			{
+				return -1;
+			}
+		} else if (sock_type == SOCK_STREAM) {
+			ASSERT_EQ(-EPIPE, ret)
+			{
+				return -1;
+			}
+		} else {
+			ASSERT_EQ(-EDESTADDRREQ, ret)
+			{
+				return -1;
+			}
+		}
+	} else {
+		ASSERT_EQ(0, ret);
+		ASSERT_EQ(1, recv(server_fd, read_buf, 1, 0));
+		ASSERT_EQ(read_buf[0], 'B')
+		{
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
 FIXTURE(protocol)
 {
-	struct service_fixture srv0, srv1, srv2, unspec_any0, unspec_srv0;
+	struct service_fixture srv0, srv1, srv2;
+	struct service_fixture unspec_any0, unspec_srv0, unspec_srv1;
 };
 
 FIXTURE_VARIANT(protocol)
@@ -313,6 +467,7 @@ FIXTURE_SETUP(protocol)
 	ASSERT_EQ(0, set_service(&self->srv2, variant->prot, 2));
 
 	ASSERT_EQ(0, set_service(&self->unspec_srv0, prot_unspec, 0));
+	ASSERT_EQ(0, set_service(&self->unspec_srv1, prot_unspec, 1));
 
 	ASSERT_EQ(0, set_service(&self->unspec_any0, prot_unspec, 0));
 	self->unspec_any0.ipv4_addr.sin_addr.s_addr = htonl(INADDR_ANY);
@@ -1119,6 +1274,441 @@ TEST_F(protocol, connect_unspec)
 	EXPECT_EQ(0, close(bind_fd));
 }
 
+TEST_F(protocol, sendmsg_stream)
+{
+	int srv0_fd, tmp_fd, client_fd, res;
+	char read_buf[1] = { 0 };
+
+	/*
+	 * Simple test for stream sockets: just deny all connect()/
+	 * send(explicit addr)/bind(), and make sure we don't interfere
+	 * with any operation.
+	 */
+	if (variant->prot.type != SOCK_STREAM)
+		return;
+
+	if (variant->sandbox == UDP_SANDBOX) {
+		const struct landlock_ruleset_attr ruleset_attr = {
+			.handled_access_net =
+				LANDLOCK_ACCESS_NET_BIND_UDP |
+				LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP,
+		};
+		const int ruleset_fd = landlock_create_ruleset(
+			&ruleset_attr, sizeof(ruleset_attr), 0);
+		ASSERT_LE(0, ruleset_fd);
+		enforce_ruleset(_metadata, ruleset_fd);
+		EXPECT_EQ(0, close(ruleset_fd));
+	}
+
+	ASSERT_LE(0, client_fd = socket_variant(&self->srv0));
+	ASSERT_LE(0, srv0_fd = socket_variant(&self->srv0));
+	ASSERT_EQ(0, bind_variant(srv0_fd, &self->srv0));
+	ASSERT_EQ(0, listen(srv0_fd, backlog));
+
+	/* Send on a non-connected socket. */
+	res = sendto_variant(client_fd, NULL, "A", 1, 0);
+	if (variant->prot.domain == AF_UNIX) {
+		EXPECT_EQ(-ENOTCONN, res);
+	} else {
+		EXPECT_EQ(-EPIPE, res);
+	}
+
+	/* Send to a truncated (invalid) address on a non-connected socket. */
+	res = sendto_variant_addrlen(client_fd, &self->srv0,
+				     get_addrlen(&self->srv0, true) - 1, "B", 1,
+				     0);
+	if (variant->prot.domain == AF_UNIX) {
+		EXPECT_EQ(-EOPNOTSUPP, res);
+	} else {
+		EXPECT_EQ(-EPIPE, res);
+	}
+
+	/* Connect. */
+	ASSERT_EQ(0, connect_variant(client_fd, &self->srv0));
+	tmp_fd = accept(srv0_fd, NULL, 0);
+	ASSERT_LE(0, tmp_fd);
+	EXPECT_EQ(0, close(srv0_fd));
+	srv0_fd = tmp_fd;
+
+	/* Send without an explicit address. */
+	EXPECT_EQ(0, sendto_variant(client_fd, NULL, "C", 1, 0));
+	EXPECT_EQ(1, recv(srv0_fd, read_buf, 1, 0))
+	{
+		TH_LOG("recv() failed: %s", strerror(errno));
+	}
+	EXPECT_EQ(read_buf[0], 'C');
+
+	/* Send to a truncated (invalid) address. */
+	res = sendto_variant_addrlen(client_fd, &self->srv0,
+				     get_addrlen(&self->srv0, true) - 1, "D", 1,
+				     0);
+	if (variant->prot.domain == AF_UNIX) {
+		EXPECT_EQ(-EISCONN, res);
+	} else {
+		EXPECT_EQ(0, res);
+		EXPECT_EQ(1, recv(srv0_fd, read_buf, 1, 0))
+		{
+			TH_LOG("recv() failed: %s", strerror(errno));
+		}
+		EXPECT_EQ(read_buf[0], 'D');
+	}
+
+	/* Send to a valid but different address. */
+	res = sendto_variant(client_fd, &self->srv1, "E", 1, 0);
+	if (variant->prot.domain == AF_UNIX) {
+		EXPECT_EQ(-EISCONN, res);
+	} else {
+		EXPECT_EQ(0, res);
+		EXPECT_EQ(1, recv(srv0_fd, read_buf, 1, 0))
+		{
+			TH_LOG("recv() failed: %s", strerror(errno));
+		}
+		EXPECT_EQ(read_buf[0], 'E');
+	}
+
+	EXPECT_EQ(0, close(client_fd));
+}
+
+TEST_F(protocol, sendmsg_dgram)
+{
+	const bool restricted = is_restricted(&variant->prot, variant->sandbox);
+	int srv0_fd, srv1_fd, client_fd, child, status, res;
+
+	if (variant->prot.type != SOCK_DGRAM)
+		return;
+
+	/* Prepare server on port #0 to be allowed. */
+	ASSERT_LE(0, srv0_fd = socket_variant(&self->srv0));
+	ASSERT_EQ(0, bind_variant(srv0_fd, &self->srv0));
+
+	/* And another server on port #1 to be denied. */
+	ASSERT_LE(0, srv1_fd = socket_variant(&self->srv1));
+	ASSERT_EQ(0, bind_variant(srv1_fd, &self->srv1));
+
+	/*
+	 * Check that sockets connected before restrictions are not
+	 * impacted in any way.
+	 */
+	child = fork();
+	ASSERT_LE(0, child);
+	if (child == 0) {
+		ASSERT_LE(0, client_fd = socket_variant(&self->srv0));
+		ASSERT_EQ(0, connect_variant(client_fd, &self->srv0));
+		if (variant->sandbox == UDP_SANDBOX) {
+			/* Deny all connect()/send(explicit addr)/bind(). */
+			const struct landlock_ruleset_attr ruleset_attr = {
+				.handled_access_net =
+					LANDLOCK_ACCESS_NET_BIND_UDP |
+					LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP,
+			};
+			const int ruleset_fd = landlock_create_ruleset(
+				&ruleset_attr, sizeof(ruleset_attr), 0);
+			ASSERT_LE(0, ruleset_fd);
+			enforce_ruleset(_metadata, ruleset_fd);
+			EXPECT_EQ(0, close(ruleset_fd));
+		}
+		EXPECT_EQ(0,
+			  test_sendmsg(_metadata, &variant->prot, client_fd,
+				       srv0_fd, NULL, restricted, restricted));
+		EXPECT_EQ(0, test_sendmsg(_metadata, &variant->prot, client_fd,
+					  srv0_fd, &self->srv0, restricted,
+					  restricted));
+		EXPECT_EQ(0, test_sendmsg(_metadata, &variant->prot, client_fd,
+					  srv1_fd, &self->srv1, restricted,
+					  restricted));
+		EXPECT_EQ(0, close(client_fd));
+		_exit(_metadata->exit_code);
+	}
+	EXPECT_EQ(child, waitpid(child, &status, 0));
+	EXPECT_EQ(1, WIFEXITED(status));
+	EXPECT_EQ(EXIT_SUCCESS, WEXITSTATUS(status));
+
+	/*
+	 * Restrict connect/send, but not bind(). Then try sending with
+	 * no destination (and no remote peer set), an allowed
+	 * destination, then a denied destination.
+	 */
+	child = fork();
+	ASSERT_LE(0, child);
+	if (child == 0) {
+		if (variant->sandbox == UDP_SANDBOX) {
+			const struct landlock_ruleset_attr ruleset_attr = {
+				.handled_access_net =
+					LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP,
+			};
+			const struct landlock_net_port_attr send_p0 = {
+				.allowed_access =
+					LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP,
+				.port = self->srv0.port,
+			};
+			const int ruleset_fd = landlock_create_ruleset(
+				&ruleset_attr, sizeof(ruleset_attr), 0);
+			ASSERT_LE(0, ruleset_fd);
+			ASSERT_EQ(0, landlock_add_rule(ruleset_fd,
+						       LANDLOCK_RULE_NET_PORT,
+						       &send_p0, 0));
+			enforce_ruleset(_metadata, ruleset_fd);
+			EXPECT_EQ(0, close(ruleset_fd));
+		}
+		ASSERT_LE(0, client_fd = socket_variant(&self->srv0));
+		EXPECT_EQ(0, test_sendmsg(_metadata, &variant->prot, client_fd,
+					  -1, NULL, false, false));
+		EXPECT_EQ(0, test_sendmsg(_metadata, &variant->prot, client_fd,
+					  srv0_fd, &self->srv0, false, false));
+		EXPECT_EQ(0, test_sendmsg(_metadata, &variant->prot, client_fd,
+					  srv1_fd, &self->srv1, false,
+					  restricted));
+		EXPECT_EQ(0, close(client_fd));
+		_exit(_metadata->exit_code);
+		return;
+	}
+	EXPECT_EQ(child, waitpid(child, &status, 0));
+	EXPECT_EQ(1, WIFEXITED(status));
+	EXPECT_EQ(EXIT_SUCCESS, WEXITSTATUS(status));
+
+	/*
+	 * Rest of this test is just for autobind enforcement, which only
+	 * exists in IP sockets.
+	 */
+	if (variant->prot.domain != AF_INET && variant->prot.domain != AF_INET6)
+		return;
+
+	/* Restrict bind() to explicit calls with an arbitrary (non-0) port. */
+	child = fork();
+	ASSERT_LE(0, child);
+	if (child == 0) {
+		const uint16_t allowed_src_port = 42424;
+		struct service_fixture allowed_src;
+
+		allowed_src = self->srv0;
+		set_port(&allowed_src, allowed_src_port);
+		if (variant->sandbox == UDP_SANDBOX) {
+			const struct landlock_ruleset_attr ruleset_attr = {
+				.handled_access_net =
+					LANDLOCK_ACCESS_NET_BIND_UDP,
+			};
+			const struct landlock_net_port_attr rule = {
+				.allowed_access = LANDLOCK_ACCESS_NET_BIND_UDP,
+				.port = allowed_src_port,
+			};
+			const int ruleset_fd = landlock_create_ruleset(
+				&ruleset_attr, sizeof(ruleset_attr), 0);
+			ASSERT_LE(0, ruleset_fd);
+			ASSERT_EQ(0, landlock_add_rule(ruleset_fd,
+						       LANDLOCK_RULE_NET_PORT,
+						       &rule, 0));
+			enforce_ruleset(_metadata, ruleset_fd);
+			EXPECT_EQ(0, close(ruleset_fd));
+		}
+		ASSERT_LE(0, client_fd = socket_variant(&self->srv0));
+
+		/* Check that implicit bind(0) in sendmsg() is denied. */
+		EXPECT_EQ(0, test_sendmsg(_metadata, &variant->prot, client_fd,
+					  srv0_fd, &self->srv0, restricted,
+					  false));
+
+		/* Same thing for autobind in connect(). */
+		res = connect_variant(client_fd, &self->srv0);
+		if (restricted) {
+			EXPECT_EQ(-EACCES, res);
+		} else {
+			EXPECT_EQ(0, res);
+		}
+		EXPECT_EQ(0, close(client_fd));
+
+		/* Make sendmsg() work by explicitly binding to the only allowed port. */
+		ASSERT_LE(0, client_fd = socket_variant(&self->srv0));
+		EXPECT_EQ(0, bind_variant(client_fd, &allowed_src));
+		EXPECT_EQ(0, test_sendmsg(_metadata, &variant->prot, client_fd,
+					  srv0_fd, &self->srv0, restricted,
+					  false));
+		EXPECT_EQ(0, close(client_fd));
+
+		/* Make connect() work by explicitly binding to the only allowed port. */
+		ASSERT_LE(0, client_fd = socket_variant(&self->srv0));
+		EXPECT_EQ(0, bind_variant(client_fd, &allowed_src));
+		EXPECT_EQ(0, connect_variant(client_fd, &self->srv0));
+		EXPECT_EQ(0, close(client_fd));
+
+		_exit(_metadata->exit_code);
+		return;
+	}
+	EXPECT_EQ(child, waitpid(child, &status, 0));
+	EXPECT_EQ(1, WIFEXITED(status));
+	EXPECT_EQ(EXIT_SUCCESS, WEXITSTATUS(status));
+
+	/*
+	 * Check that %LANDLOCK_ACCESS_NET_BIND_UDP on port 0 allows
+	 * implicit autobinds.
+	 */
+	child = fork();
+	ASSERT_LE(0, child);
+	if (child == 0) {
+		if (variant->sandbox == UDP_SANDBOX) {
+			const struct landlock_ruleset_attr ruleset_attr = {
+				.handled_access_net =
+					LANDLOCK_ACCESS_NET_BIND_UDP,
+			};
+			const struct landlock_net_port_attr rule = {
+				.allowed_access = LANDLOCK_ACCESS_NET_BIND_UDP,
+				.port = 0,
+			};
+			const int ruleset_fd = landlock_create_ruleset(
+				&ruleset_attr, sizeof(ruleset_attr), 0);
+			ASSERT_LE(0, ruleset_fd);
+			ASSERT_EQ(0, landlock_add_rule(ruleset_fd,
+						       LANDLOCK_RULE_NET_PORT,
+						       &rule, 0));
+			enforce_ruleset(_metadata, ruleset_fd);
+			EXPECT_EQ(0, close(ruleset_fd));
+		}
+		ASSERT_LE(0, client_fd = socket_variant(&self->srv0));
+		EXPECT_EQ(0, test_sendmsg(_metadata, &variant->prot, client_fd,
+					  srv0_fd, &self->srv0, false, false));
+		EXPECT_EQ(0, close(client_fd));
+		_exit(_metadata->exit_code);
+	}
+	EXPECT_EQ(child, waitpid(child, &status, 0));
+	EXPECT_EQ(1, WIFEXITED(status));
+	EXPECT_EQ(EXIT_SUCCESS, WEXITSTATUS(status));
+}
+
+TEST_F(protocol, sendmsg_unspec)
+{
+	const bool restricted = is_restricted(&variant->prot, variant->sandbox);
+	int client_fd, srv0_fd, srv1_fd, res;
+	char read_buf[1] = { 0 };
+
+	/*
+	 * We already test for the absence of influence on sendmsg for
+	 * other socket types and other address families, so there is
+	 * no point in adapting this test for stream sockets too.
+	 */
+	if (variant->prot.type != SOCK_DGRAM)
+		return;
+
+	/* Prepare client of the right family. */
+	ASSERT_LE(0, client_fd = socket_variant(&self->srv0));
+
+	/* Prepare server on port #0 to be allowed. */
+	ASSERT_LE(0, srv0_fd = socket_variant(&self->srv0));
+	ASSERT_EQ(0, bind_variant(srv0_fd, &self->srv0));
+
+	/* And another server on port #1 to be denied. */
+	ASSERT_LE(0, srv1_fd = socket_variant(&self->srv1));
+	ASSERT_EQ(0, bind_variant(srv1_fd, &self->srv1));
+
+	if (variant->sandbox == UDP_SANDBOX) {
+		const struct landlock_ruleset_attr ruleset_attr = {
+			.handled_access_net =
+				LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP,
+		};
+		const struct landlock_net_port_attr rule = {
+			.allowed_access = LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP,
+			.port = self->srv0.port,
+		};
+		const int ruleset_fd = landlock_create_ruleset(
+			&ruleset_attr, sizeof(ruleset_attr), 0);
+		ASSERT_LE(0, ruleset_fd);
+		ASSERT_EQ(0,
+			  landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
+					    &rule, 0));
+		enforce_ruleset(_metadata, ruleset_fd);
+		EXPECT_EQ(0, close(ruleset_fd));
+	}
+
+	/* Explicit AF_UNSPEC address but truncated. */
+	EXPECT_EQ(-EINVAL, sendto_variant_addrlen(
+				   client_fd, &self->unspec_srv0,
+				   get_addrlen(&self->unspec_srv0, true) - 1,
+				   "A", 1, 0));
+
+	/*
+	 * Explicit AF_UNSPEC address, should be treated as AF_INET by
+	 * IPv4 sockets (and thus map to srv0, allowed), but be denied by
+	 * IPv6 sockets.
+	 */
+	res = sendto_variant(client_fd, &self->unspec_srv0, "B", 1, 0);
+	if (variant->prot.domain == AF_INET6) {
+		if (restricted) {
+			/* Always denied on IPv6 socket. */
+			EXPECT_EQ(-EACCES, res);
+		} else {
+			/* IPv6 sockets treat AF_UNSPEC as a NULL address. */
+			EXPECT_EQ(-EDESTADDRREQ, res);
+		}
+	} else if (variant->prot.domain == AF_INET) {
+		EXPECT_EQ(0, res);
+		EXPECT_EQ(1, read(srv0_fd, read_buf, 1))
+		{
+			TH_LOG("read() failed: %s", strerror(errno));
+		}
+		EXPECT_EQ(read_buf[0], 'B');
+	} else {
+		/* Unix sockets don't accept AF_UNSPEC. */
+		EXPECT_EQ(-EINVAL, res);
+	}
+
+	/*
+	 * Explicit AF_UNSPEC address, should be treated as AF_INET on
+	 * IPv4 sockets (and thus map to srv1, denied), and be denied
+	 * on IPv6 sockets as always.
+	 */
+	res = sendto_variant(client_fd, &self->unspec_srv1, "C", 1, 0);
+	if (variant->prot.domain == AF_INET6) {
+		if (restricted) {
+			/* Always denied on IPv6 socket. */
+			EXPECT_EQ(-EACCES, res);
+		} else {
+			/* IPv6 sockets treat AF_UNSPEC as a NULL address. */
+			EXPECT_EQ(-EDESTADDRREQ, res);
+		}
+	} else if (variant->prot.domain == AF_INET) {
+		if (restricted) {
+			/* Sending to srv1 is not allowed, only srv0. */
+			EXPECT_EQ(-EACCES, res);
+		} else {
+			EXPECT_EQ(0, res);
+			EXPECT_EQ(1, read(srv1_fd, read_buf, 1))
+			{
+				TH_LOG("read() failed: %s", strerror(errno));
+			}
+			EXPECT_EQ(read_buf[0], 'C');
+		}
+	} else {
+		/* Unix sockets don't accept AF_UNSPEC. */
+		EXPECT_EQ(-EINVAL, res);
+	}
+
+	ASSERT_EQ(0, connect_variant(client_fd, &self->srv0));
+
+	/* Minimal explicit AF_UNSPEC address (just the sa_family_t field). */
+	res = sendto_variant_addrlen(client_fd, &self->unspec_srv0,
+				     get_addrlen(&self->unspec_srv0, true), "D",
+				     1, 0);
+	if (variant->prot.domain == AF_INET6) {
+		if (restricted) {
+			/* AF_UNSPEC is always denied in IPv6. */
+			EXPECT_EQ(-EACCES, res);
+		} else {
+			/*
+			 * IPv6 sockets treat AF_UNSPEC as a NULL address,
+			 * falling back to the connected address.
+			 */
+			EXPECT_EQ(0, res);
+			EXPECT_EQ(1, read(srv0_fd, read_buf, 1));
+			EXPECT_EQ(read_buf[0], 'D');
+		}
+	} else {
+		/*
+		 * IPv4 socket will expect a struct sockaddr_in, our address
+		 * is considered truncated.
+		 * And Unix sockets don't accept AF_UNSPEC at all.
+		 */
+		EXPECT_EQ(-EINVAL, res);
+	}
+}
+
 FIXTURE(ipv4)
 {
 	struct service_fixture srv0, srv1;
@@ -2183,6 +2773,7 @@ FIXTURE(audit)
 {
 	struct service_fixture srv0;
 	struct service_fixture srv1;
+	struct service_fixture unspec_srv0;
 	struct audit_filter audit_filter;
 	int audit_fd;
 };
@@ -2235,8 +2826,13 @@ FIXTURE_VARIANT_ADD(audit, ipv6_udp) {
 
 FIXTURE_SETUP(audit)
 {
+	struct protocol_variant prot_unspec = variant->prot;
+
+	prot_unspec.domain = AF_UNSPEC;
+
 	ASSERT_EQ(0, set_service(&self->srv0, variant->prot, 0));
 	ASSERT_EQ(0, set_service(&self->srv1, variant->prot, 1));
+	ASSERT_EQ(0, set_service(&self->unspec_srv0, prot_unspec, 0));
 
 	setup_loopback(_metadata);
 
@@ -2342,4 +2938,58 @@ TEST_F(audit, connect)
 	EXPECT_EQ(0, close(sock_fd));
 }
 
+TEST_F(audit, sendmsg)
+{
+	const struct landlock_ruleset_attr ruleset_attr = {
+		.handled_access_net = LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP |
+				      LANDLOCK_ACCESS_NET_BIND_UDP,
+	};
+	const struct landlock_net_port_attr rule = {
+		.allowed_access = LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP,
+		.port = self->srv1.port,
+	};
+	const int ruleset_fd =
+		landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+	struct audit_records records;
+	int sock_fd;
+
+	/* Sendmsg on stream sockets is never denied. */
+	if (variant->prot.type != SOCK_DGRAM)
+		return;
+
+	ASSERT_LE(0, ruleset_fd);
+	ASSERT_EQ(0, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
+				       &rule, 0));
+	enforce_ruleset(_metadata, ruleset_fd);
+	EXPECT_EQ(0, close(ruleset_fd));
+
+	sock_fd = socket_variant(&self->srv0);
+	ASSERT_LE(0, sock_fd);
+	EXPECT_EQ(-EACCES, sendto_variant(sock_fd, &self->srv0, "A", 1, 0));
+	EXPECT_EQ(0, matches_auditlog(self->audit_fd, "net\\.connect_send_udp",
+				      "daddr", variant->addr, "dest"));
+
+	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
+	EXPECT_EQ(0, records.access);
+	EXPECT_EQ(1, records.domain);
+
+	/* Check that autobind generates a denied bind event. */
+	EXPECT_EQ(-EACCES, sendto_variant(sock_fd, &self->srv1, "A", 1, 0));
+	EXPECT_EQ(0, matches_auditlog(self->audit_fd, "net\\.bind_udp", NULL,
+				      NULL, NULL));
+	EXPECT_EQ(0, records.access);
+	EXPECT_EQ(1, records.domain);
+
+	EXPECT_EQ(-EACCES,
+		  sendto_variant(sock_fd, &self->unspec_srv0, "B", 1, 0));
+	EXPECT_EQ(0, matches_auditlog(self->audit_fd, "net\\.connect_send_udp",
+				      "daddr", NULL, "dest"));
+
+	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
+	EXPECT_EQ(0, records.access);
+	EXPECT_EQ(0, records.domain);
+
+	EXPECT_EQ(0, close(sock_fd));
+}
+
 TEST_HARNESS_MAIN
-- 
2.39.5



* [PATCH v4 6/7] samples/landlock: Add sandboxer UDP access control
From: Matthieu Buffet @ 2026-05-02 12:43 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Mikhail Ivanov,
	konstantin.meskhidze, Tingmao Wang, netdev, Matthieu Buffet
In-Reply-To: <20260502124306.3975990-1-matthieu@buffet.re>

Add environment variables to control associated access rights:
- LL_UDP_BIND
- LL_UDP_CONNECT_SEND

Each one takes a list of ports separated by colons, like other list
options.
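
As a rough illustration of that list format (a sketch only, not the
sandboxer's actual parser; parse_port_list() is a hypothetical helper),
a value such as "53:5353" could be split like this:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not taken from sandboxer.c: splits a
 * colon-separated list such as "53:5353" into port numbers.
 * Returns the number of ports stored in @ports, or -1 on a malformed
 * entry, an out-of-range port, or more than @max entries.
 */
static int parse_port_list(const char *list, uint16_t *ports, size_t max)
{
	const char *cur = list;
	size_t count = 0;

	/* An empty string means an empty list. */
	if (*cur == '\0')
		return 0;

	for (;;) {
		char *end;
		unsigned long long value;

		/* strtoull() would accept leading spaces or a sign: reject them. */
		if (*cur < '0' || *cur > '9')
			return -1;

		errno = 0;
		value = strtoull(cur, &end, 10);
		if (errno || value > UINT16_MAX || count >= max)
			return -1;

		ports[count++] = (uint16_t)value;
		if (*end == '\0')
			return (int)count;
		if (*end != ':')
			return -1;
		cur = end + 1;
	}
}
```

With this sketch, LL_UDP_CONNECT_SEND="53" would yield a single port,
and a trailing colon ("53:") would be rejected as malformed.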

Signed-off-by: Matthieu Buffet <matthieu@buffet.re>
---
 samples/landlock/sandboxer.c | 40 ++++++++++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
index 66e56ae275c6..94e399e6b146 100644
--- a/samples/landlock/sandboxer.c
+++ b/samples/landlock/sandboxer.c
@@ -62,6 +62,8 @@ static inline int landlock_restrict_self(const int ruleset_fd,
 #define ENV_TCP_CONNECT_NAME "LL_TCP_CONNECT"
 #define ENV_SCOPED_NAME "LL_SCOPED"
 #define ENV_FORCE_LOG_NAME "LL_FORCE_LOG"
+#define ENV_UDP_BIND_NAME "LL_UDP_BIND"
+#define ENV_UDP_CONNECT_SEND_NAME "LL_UDP_CONNECT_SEND"
 #define ENV_DELIMITER ":"
 
 static int str2num(const char *numstr, __u64 *num_dst)
@@ -301,7 +303,7 @@ static bool check_ruleset_scope(const char *const env_var,
 
 /* clang-format on */
 
-#define LANDLOCK_ABI_LAST 9
+#define LANDLOCK_ABI_LAST 10
 
 #define XSTR(s) #s
 #define STR(s) XSTR(s)
@@ -324,6 +326,10 @@ static const char help[] =
 	"means an empty list):\n"
 	"* " ENV_TCP_BIND_NAME ": ports allowed to bind (server)\n"
 	"* " ENV_TCP_CONNECT_NAME ": ports allowed to connect (client)\n"
+	"* " ENV_UDP_BIND_NAME ": local UDP ports allowed to bind (server: "
+	"prepare to receive on port / client: set as source port)\n"
+	"* " ENV_UDP_CONNECT_SEND_NAME ": remote UDP ports allowed to connect "
+	"or sendmsg (client: use as destination port / server: receive only from it)\n"
 	"* " ENV_SCOPED_NAME ": actions denied on the outside of the landlock domain\n"
 	"  - \"a\" to restrict opening abstract unix sockets\n"
 	"  - \"s\" to restrict sending signals\n"
@@ -336,6 +342,7 @@ static const char help[] =
 	ENV_FS_RW_NAME "=\"/dev/null:/dev/full:/dev/zero:/dev/pts:/tmp\" "
 	ENV_TCP_BIND_NAME "=\"9418\" "
 	ENV_TCP_CONNECT_NAME "=\"80:443\" "
+	ENV_UDP_CONNECT_SEND_NAME "=\"53\" "
 	ENV_SCOPED_NAME "=\"a:s\" "
 	"%1$s bash -i\n"
 	"\n"
@@ -356,7 +363,9 @@ int main(const int argc, char *const argv[], char *const *const envp)
 	struct landlock_ruleset_attr ruleset_attr = {
 		.handled_access_fs = access_fs_rw,
 		.handled_access_net = LANDLOCK_ACCESS_NET_BIND_TCP |
-				      LANDLOCK_ACCESS_NET_CONNECT_TCP,
+				      LANDLOCK_ACCESS_NET_CONNECT_TCP |
+				      LANDLOCK_ACCESS_NET_BIND_UDP |
+				      LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP,
 		.scoped = LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
 			  LANDLOCK_SCOPE_SIGNAL,
 	};
@@ -444,6 +453,13 @@ int main(const int argc, char *const argv[], char *const *const envp)
 		/* Removes LANDLOCK_ACCESS_FS_RESOLVE_UNIX for ABI < 9 */
 		ruleset_attr.handled_access_fs &=
 			~LANDLOCK_ACCESS_FS_RESOLVE_UNIX;
+		__attribute__((fallthrough));
+	case 9:
+		/* Removes UDP support for ABI < 10 */
+		ruleset_attr.handled_access_net &=
+			~(LANDLOCK_ACCESS_NET_BIND_UDP |
+			  LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP);
+
 		/* Must be printed for any ABI < LANDLOCK_ABI_LAST. */
 		fprintf(stderr,
 			"Hint: You should update the running kernel "
@@ -475,6 +491,18 @@ int main(const int argc, char *const argv[], char *const *const envp)
 		ruleset_attr.handled_access_net &=
 			~LANDLOCK_ACCESS_NET_CONNECT_TCP;
 	}
+	/* Removes UDP bind access control if not supported by a user. */
+	env_port_name = getenv(ENV_UDP_BIND_NAME);
+	if (!env_port_name) {
+		ruleset_attr.handled_access_net &=
+			~LANDLOCK_ACCESS_NET_BIND_UDP;
+	}
+	/* Removes UDP connect/send access control if not supported by a user. */
+	env_port_name = getenv(ENV_UDP_CONNECT_SEND_NAME);
+	if (!env_port_name) {
+		ruleset_attr.handled_access_net &=
+			~LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP;
+	}
 
 	if (check_ruleset_scope(ENV_SCOPED_NAME, &ruleset_attr))
 		return 1;
@@ -519,6 +547,14 @@ int main(const int argc, char *const argv[], char *const *const envp)
 				 LANDLOCK_ACCESS_NET_CONNECT_TCP)) {
 		goto err_close_ruleset;
 	}
+	if (populate_ruleset_net(ENV_UDP_BIND_NAME, ruleset_fd,
+				 LANDLOCK_ACCESS_NET_BIND_UDP)) {
+		goto err_close_ruleset;
+	}
+	if (populate_ruleset_net(ENV_UDP_CONNECT_SEND_NAME, ruleset_fd,
+				 LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP)) {
+		goto err_close_ruleset;
+	}
 
 	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
 		perror("Failed to restrict privileges");
-- 
2.39.5



* [PATCH v4 7/7] landlock: Add documentation for UDP support
From: Matthieu Buffet @ 2026-05-02 12:43 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Mikhail Ivanov,
	konstantin.meskhidze, Tingmao Wang, netdev, Matthieu Buffet
In-Reply-To: <20260502124306.3975990-1-matthieu@buffet.re>

Add an example of UDP usage, without detailing the two access rights.
Slightly change the example used in code blocks: build a ruleset for a
DNS client, so that it uses both TCP and UDP.
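
To make the DNS client example concrete, here is a minimal sketch of the
rules such a client would need. The struct and function names are
hypothetical. The bit values for CONNECT_TCP and BIND_UDP mirror the UAPI
header, while the value used for CONNECT_SEND_UDP is an assumption made
for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the UAPI access rights (see lead-in). */
#define SK_NET_CONNECT_TCP (1ULL << 1)
#define SK_NET_BIND_UDP (1ULL << 2)
/* Assumption: the next free bit; check the real header before use. */
#define SK_NET_CONNECT_SEND_UDP (1ULL << 3)

/* Hypothetical mirror of the (allowed_access, port) pair of a net rule. */
struct sk_port_rule {
	uint64_t allowed_access;
	uint64_t port;
};

/*
 * Fills @out with the rules a DNS client needs, skipping any right the
 * ruleset does not handle (e.g. UDP rights on an older ABI).
 * Returns the number of rules written (at most 3).
 */
static size_t dns_client_rules(uint64_t handled_net,
			       struct sk_port_rule out[3])
{
	size_t n = 0;

	assert(out);
	/* DNS over TCP: outbound connections to port 53. */
	if (handled_net & SK_NET_CONNECT_TCP)
		out[n++] = (struct sk_port_rule){ SK_NET_CONNECT_TCP, 53 };
	/* DNS over UDP: datagrams sent to port 53. */
	if (handled_net & SK_NET_CONNECT_SEND_UDP)
		out[n++] = (struct sk_port_rule){ SK_NET_CONNECT_SEND_UDP, 53 };
	/* Autobind of an ephemeral source port is checked like bind(0). */
	if (handled_net & SK_NET_BIND_UDP)
		out[n++] = (struct sk_port_rule){ SK_NET_BIND_UDP, 0 };
	return n;
}
```

Each returned entry would then be passed to landlock_add_rule() with
LANDLOCK_RULE_NET_PORT, as in the documentation's code blocks.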

Signed-off-by: Matthieu Buffet <matthieu@buffet.re>
---
 Documentation/userspace-api/landlock.rst | 89 ++++++++++++++++++------
 1 file changed, 68 insertions(+), 21 deletions(-)

diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
index fd8b78c31f2f..9d5da9896628 100644
--- a/Documentation/userspace-api/landlock.rst
+++ b/Documentation/userspace-api/landlock.rst
@@ -40,8 +40,8 @@ Filesystem rules
     and the related filesystem actions are defined with
     `filesystem access rights`.
 
-Network rules (since ABI v4)
-    For these rules, the object is a TCP port,
+Network rules (since ABI v4 for TCP and v10 for UDP)
+    For these rules, the object is a TCP or UDP port,
     and the related actions are defined with `network access rights`.
 
 Defining and enforcing a security policy
@@ -49,11 +49,11 @@ Defining and enforcing a security policy
 
 We first need to define the ruleset that will contain our rules.
 
-For this example, the ruleset will contain rules that only allow filesystem
-read actions and establish a specific TCP connection. Filesystem write
-actions and other TCP actions will be denied.
+For this example, the ruleset will contain rules that only allow some
+filesystem read actions and some specific UDP and TCP actions. Filesystem
+write actions and other TCP/UDP actions will be denied.
 
-The ruleset then needs to handle both these kinds of actions.  This is
+The ruleset then needs to handle all these kinds of actions.  This is
 required for backward and forward compatibility (i.e. the kernel and user
 space may not know each other's supported restrictions), hence the need
 to be explicit about the denied-by-default access rights.
@@ -81,7 +81,9 @@ to be explicit about the denied-by-default access rights.
             LANDLOCK_ACCESS_FS_RESOLVE_UNIX,
         .handled_access_net =
             LANDLOCK_ACCESS_NET_BIND_TCP |
-            LANDLOCK_ACCESS_NET_CONNECT_TCP,
+            LANDLOCK_ACCESS_NET_CONNECT_TCP |
+            LANDLOCK_ACCESS_NET_BIND_UDP |
+            LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP,
         .scoped =
             LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
             LANDLOCK_SCOPE_SIGNAL,
@@ -132,6 +134,12 @@ version, and only use the available subset of access rights:
     case 6 ... 8:
         /* Removes LANDLOCK_ACCESS_FS_RESOLVE_UNIX for ABI < 9 */
         ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_RESOLVE_UNIX;
+        __attribute__((fallthrough));
+    case 9:
+        /* Removes LANDLOCK_ACCESS_*_UDP for ABI < 10 */
+        ruleset_attr.handled_access_net &=
+            ~(LANDLOCK_ACCESS_NET_BIND_UDP |
+              LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP);
     }
 
 This enables the creation of an inclusive ruleset that will contain our rules.
@@ -180,21 +188,50 @@ this file descriptor.
 
 It may also be required to create rules following the same logic as explained
 for the ruleset creation, by filtering access rights according to the Landlock
-ABI version.  In this example, this is not required because all of the requested
-``allowed_access`` rights are already available in ABI 1.
+ABI version.  So far, this was not required because all of the requested
+``allowed_access`` rights have always been available, from ABI 1.
 
-For network access-control, we can add a set of rules that allow to use a port
-number for a specific action: HTTPS connections.
+For network access-control, we will add a set of rules to allow DNS
+queries, which requires both UDP and TCP. For TCP, we need to allow
+outbound connections to port 53, which can be handled and granted starting
+with ABI 4:
 
 .. code-block:: c
 
-    struct landlock_net_port_attr net_port = {
-        .allowed_access = LANDLOCK_ACCESS_NET_CONNECT_TCP,
-        .port = 443,
-    };
+    if (ruleset_attr.handled_access_net & LANDLOCK_ACCESS_NET_CONNECT_TCP) {
+        struct landlock_net_port_attr net_port = {
+            .allowed_access = LANDLOCK_ACCESS_NET_CONNECT_TCP,
+            .port = 53,
+        };
 
-    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
-                            &net_port, 0);
+        err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
+                                &net_port, 0);
+
+We also need to be able to send UDP datagrams to port 53, which requires
+granting ``LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP``. Since our DNS client will
+emit datagrams without explicitly binding to a specific source port, its UDP
+socket will automatically bind an ephemeral port. To allow this behaviour,
+we also need to grant ``LANDLOCK_ACCESS_NET_BIND_UDP`` on port 0, as if
+the program explicitly called :manpage:`bind(2)` on port 0.
+
+.. code-block:: c
+
+    if (ruleset_attr.handled_access_net & LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP) {
+        const struct landlock_net_port_attr send_dst_port = {
+            .allowed_access = LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP,
+            .port = 53,
+        };
+        err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
+                                &send_dst_port, 0);
+        [...]
+
+    if (ruleset_attr.handled_access_net & LANDLOCK_ACCESS_NET_BIND_UDP) {
+        const struct landlock_net_port_attr bind_src_port = {
+            .allowed_access = LANDLOCK_ACCESS_NET_BIND_UDP,
+            .port = 0,
+        };
+        err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
+                                &bind_src_port, 0);
 
 When passing a non-zero ``flags`` argument to ``landlock_restrict_self()``, a
 similar backwards compatibility check is needed for the restrict flags
@@ -228,7 +265,7 @@ similar backwards compatibility check is needed for the restrict flags
 The next step is to restrict the current thread from gaining more privileges
 (e.g. through a SUID binary).  We now have a ruleset with the first rule
 allowing read and execute access to ``/usr`` while denying all other handled
-accesses for the filesystem, and a second rule allowing HTTPS connections.
+accesses for the filesystem, and two more rules allowing DNS queries.
 
 .. code-block:: c
 
@@ -716,6 +753,16 @@ Starting with the Landlock ABI version 9, it is possible to restrict
 connections to pathname UNIX domain sockets (:manpage:`unix(7)`) using
 the new ``LANDLOCK_ACCESS_FS_RESOLVE_UNIX`` right.
 
+UDP bind, connect, sendto, sendmsg and sendmmsg (ABI < 10)
+----------------------------------------------------------
+
+Starting with the Landlock ABI version 10, it is possible to restrict
+setting the local port of UDP sockets with the
+``LANDLOCK_ACCESS_NET_BIND_UDP`` right.
+The ``LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP`` right controls setting the
+remote port of UDP sockets, and sending datagrams to an explicit remote
+port (ignoring any destination set on UDP sockets).
+
 .. _kernel_support:
 
 Kernel support
@@ -778,10 +825,10 @@ the boot loader.
 Network support
 ---------------
 
-To be able to explicitly allow TCP operations (e.g., adding a network rule with
-``LANDLOCK_ACCESS_NET_BIND_TCP``), the kernel must support TCP
+To be able to explicitly allow TCP or UDP operations (e.g., adding a network rule with
+``LANDLOCK_ACCESS_NET_BIND_TCP``), the kernel must support the TCP/IP protocol suite
 (``CONFIG_INET=y``).  Otherwise, sys_landlock_add_rule() returns an
-``EAFNOSUPPORT`` error, which can safely be ignored because this kind of TCP
+``EAFNOSUPPORT`` error, which can safely be ignored because this kind of TCP or UDP
 operation is already not possible.
 
 Questions and answers
-- 
2.39.5



* [PATCH v4 0/7] landlock: Add UDP access control support
From: Matthieu Buffet @ 2026-05-02 12:42 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Mikhail Ivanov,
	konstantin.meskhidze, Tingmao Wang, netdev, Matthieu Buffet

Hi,

This is V4 of UDP access control in Landlock. Thanks to the round of
review of v3, access rights have changed to something that seems easier
to use and understand. It adds only two access rights, to restrict
configuring local and remote addresses on UDP sockets. The one that
restricts setting a remote address also controls sending datagrams to
explicit remote addresses (ignoring any remote address preset on the
socket). The one that restricts binding to a local port also applies
when the kernel auto-binds an ephemeral port.

v1:
Link: https://lore.kernel.org/all/20240916122230.114800-1-matthieu@buffet.re/
v2:
Link: https://lore.kernel.org/all/20241214184540.3835222-1-matthieu@buffet.re/
v3:
Link: https://lore.kernel.org/all/20251212163704.142301-1-matthieu@buffet.re/

The limitation around allowing a process to send but not receive is
still there, and could warrant another patch if there is a real user
need.
I'm not entirely happy with the clarity of logs generated for denied
autobinds ("domain=xxxxxx blockers=net.bind_udp"), because addresses
and ports are currently only logged if they are non-zero. A later
(coordinated LSM-wide) patch could improve readability by replacing the
!= 0 checks with new booleans in struct lsm_network_audit. I'm also not
entirely happy with the integration into the existing TCP selftests, but
refactoring them has already been discussed earlier.

Changes v1->v2
==============
- recvmsg hook is gone and sendmsg hook doesn't apply when sending to a
  remote address pre-set on socket, to improve performance
- don't add a get_addr_port() helper function, which required a weird
  "am I in IPv4 or IPv6 context" check
- reorder hook prologue for consistency: check domain, then type and
  family

Changes v2->v3
==============
- removed support for sending datagrams with an explicit destination
  address of family AF_UNSPEC, which allowed bypassing restrictions
  through a race condition
- rebased on linux-mic/next => add support for auditing
- fixed mistake in selftests when using unspec_srv variables, which were
  implicitly of type SOCK_STREAM and did not actually test UDP code
- add tests for IPPROTO_IP
- improved docs, split off TCP-related refactoring

Changes v3->v4
==============
- merge LANDLOCK_ACCESS_NET_CONNECT_UDP and
  LANDLOCK_ACCESS_NET_SENDTO_UDP into
  LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP (everything that might set the
  destination of a datagram)
- make LANDLOCK_ACCESS_NET_BIND_UDP apply when kernel is about to
  auto-bind an ephemeral port for the caller. Block it if policy would
  not allow an explicit call to bind(0)
- only deny sending AF_UNSPEC datagrams on IPv6 sockets, where there is
  a risk of the address family changing midway
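
The resulting semantics for sending can be summarized with a small
userspace model. This is only a sketch of the behaviour described above,
not the kernel implementation: struct udp_policy and model_udp_send() are
hypothetical names, and the order of the two checks is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a sandbox policy; not kernel code. */
struct udp_policy {
	bool handles_bind;         /* ruleset handles the UDP bind right */
	bool handles_connect_send; /* ruleset handles the UDP connect/send right */
	const uint16_t *bind_ports; /* local ports allowed for bind */
	size_t n_bind;
	const uint16_t *dst_ports; /* remote ports allowed for connect/send */
	size_t n_dst;
};

static bool port_allowed(const uint16_t *ports, size_t n, uint16_t port)
{
	for (size_t i = 0; i < n; i++)
		if (ports[i] == port)
			return true;
	return false;
}

/*
 * Models sending a datagram. @dport is NULL when no explicit destination
 * is given (the peer preset by connect() is used). Returns 0 or -EACCES.
 */
static int model_udp_send(const struct udp_policy *policy, bool already_bound,
			  const uint16_t *dport)
{
	assert(policy);

	/* Sending to the preset peer is not checked again. */
	if (!dport)
		return 0;

	/* An explicit destination needs the connect/send right on its port. */
	if (policy->handles_connect_send &&
	    !port_allowed(policy->dst_ports, policy->n_dst, *dport))
		return -EACCES;

	/* An unbound socket would autobind: treat it like bind(0). */
	if (!already_bound && policy->handles_bind &&
	    !port_allowed(policy->bind_ports, policy->n_bind, 0))
		return -EACCES;

	return 0;
}
```

With a policy that handles both rights but only allows one destination
port, sending to that port from an unbound socket is still denied in this
model, because the implied bind(0) is not allowed. This matches what the
audit selftest for sendmsg expects.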

Patch is based on https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git
3457a5ccacd3 ("landlock: Document fallocate(2) as another truncation corner case")
All lines added are covered with selftests, except the "default: return
0" in current_check_autobind_udp_socket() which is not currently
reachable (net.c goes from 92.9%->94.6% line coverage).

Let me know what you think!

Closes: https://github.com/landlock-lsm/linux/issues/10

Matthieu Buffet (7):
  landlock: Add UDP bind() access control
  landlock: Add UDP connect() access control
  landlock: Add UDP send access control
  selftests/landlock: Add UDP bind/connect tests
  selftests/landlock: Add tests for sendmsg()
  samples/landlock: Add sandboxer UDP access control
  landlock: Add documentation for UDP support

 Documentation/userspace-api/landlock.rst     |   89 +-
 include/uapi/linux/landlock.h                |   35 +-
 samples/landlock/sandboxer.c                 |   40 +-
 security/landlock/audit.c                    |    3 +
 security/landlock/limits.h                   |    2 +-
 security/landlock/net.c                      |  161 ++-
 security/landlock/syscalls.c                 |    2 +-
 tools/testing/selftests/landlock/base_test.c |    4 +-
 tools/testing/selftests/landlock/net_test.c  | 1146 ++++++++++++++++--
 9 files changed, 1341 insertions(+), 141 deletions(-)


base-commit: 3457a5ccacd34fdd5ebd3a4745e721b5a1239690
-- 
2.39.5



* [PATCH v4 1/7] landlock: Add UDP bind() access control
From: Matthieu Buffet @ 2026-05-02 12:43 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Mikhail Ivanov,
	konstantin.meskhidze, Tingmao Wang, netdev, Matthieu Buffet
In-Reply-To: <20260502124306.3975990-1-matthieu@buffet.re>

Add support for a first fine-grained UDP access right.
LANDLOCK_ACCESS_NET_BIND_UDP controls the ability to set the local port
of a UDP socket (via bind()). It will be useful for servers (to start
receiving datagrams), and for some clients that need to use a specific
source port (e.g. mDNS requires using port 5353).

For obvious performance reasons, access control is only enforced when
configuring sockets, not when using them for common send/recv
operations.

Bump ABI to allow userspace to detect and use this new right.
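
For illustration, userspace could use the reported ABI version to drop
rights the running kernel does not know about. This is a minimal sketch:
net_access_for_abi() is a hypothetical helper, and the SK_* constants are
local stand-ins that mirror the bit values of the UAPI header.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins mirroring the UAPI bit values (see lead-in). */
#define SK_NET_BIND_TCP (1ULL << 0)
#define SK_NET_CONNECT_TCP (1ULL << 1)
#define SK_NET_BIND_UDP (1ULL << 2)

/*
 * Hypothetical helper: returns the subset of @wanted network access
 * rights that a kernel reporting Landlock ABI version @abi can handle.
 */
static uint64_t net_access_for_abi(int abi, uint64_t wanted)
{
	/* Network rules only exist since ABI 4 (also covers errors < 0). */
	if (abi < 4)
		return 0;
	/* UDP bind control is only available starting with ABI 10. */
	if (abi < 10)
		wanted &= ~SK_NET_BIND_UDP;
	return wanted;
}
```

The abi argument would typically come from landlock_create_ruleset(NULL,
0, LANDLOCK_CREATE_RULESET_VERSION), and the result would be stored in
handled_access_net before creating the ruleset.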

Signed-off-by: Matthieu Buffet <matthieu@buffet.re>
---
 include/uapi/linux/landlock.h                | 12 +++++++++---
 security/landlock/audit.c                    |  1 +
 security/landlock/limits.h                   |  2 +-
 security/landlock/net.c                      | 18 ++++++++++++------
 security/landlock/syscalls.c                 |  2 +-
 tools/testing/selftests/landlock/base_test.c |  4 ++--
 tools/testing/selftests/landlock/net_test.c  |  5 +++--
 7 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
index 10a346e55e95..045b251ff1b4 100644
--- a/include/uapi/linux/landlock.h
+++ b/include/uapi/linux/landlock.h
@@ -201,9 +201,9 @@ struct landlock_net_port_attr {
 	 * with ``setsockopt(IP_LOCAL_PORT_RANGE)``.
 	 *
 	 * A Landlock rule with port 0 and the %LANDLOCK_ACCESS_NET_BIND_TCP
-	 * right means that requesting to bind on port 0 is allowed and it will
-	 * automatically translate to binding on a kernel-assigned ephemeral
-	 * port.
+	 * or %LANDLOCK_ACCESS_NET_BIND_UDP right means that requesting to bind
+	 * on port 0 is allowed and it will automatically translate to binding
+	 * on a kernel-assigned ephemeral port.
 	 */
 	__u64 port;
 };
@@ -373,10 +373,16 @@ struct landlock_net_port_attr {
  *   port. Support added in Landlock ABI version 4.
  * - %LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect TCP sockets to the given
  *   remote port. Support added in Landlock ABI version 4.
+ *
+ * And similarly for UDP port numbers:
+ *
+ * - %LANDLOCK_ACCESS_NET_BIND_UDP: Bind UDP sockets to the given local
+ *   port. Support added in Landlock ABI version 10.
  */
 /* clang-format off */
 #define LANDLOCK_ACCESS_NET_BIND_TCP			(1ULL << 0)
 #define LANDLOCK_ACCESS_NET_CONNECT_TCP			(1ULL << 1)
+#define LANDLOCK_ACCESS_NET_BIND_UDP			(1ULL << 2)
 /* clang-format on */
 
 /**
diff --git a/security/landlock/audit.c b/security/landlock/audit.c
index 8d0edf94037d..e676ebffeebe 100644
--- a/security/landlock/audit.c
+++ b/security/landlock/audit.c
@@ -45,6 +45,7 @@ static_assert(ARRAY_SIZE(fs_access_strings) == LANDLOCK_NUM_ACCESS_FS);
 static const char *const net_access_strings[] = {
 	[BIT_INDEX(LANDLOCK_ACCESS_NET_BIND_TCP)] = "net.bind_tcp",
 	[BIT_INDEX(LANDLOCK_ACCESS_NET_CONNECT_TCP)] = "net.connect_tcp",
+	[BIT_INDEX(LANDLOCK_ACCESS_NET_BIND_UDP)] = "net.bind_udp",
 };
 
 static_assert(ARRAY_SIZE(net_access_strings) == LANDLOCK_NUM_ACCESS_NET);
diff --git a/security/landlock/limits.h b/security/landlock/limits.h
index b454ad73b15e..c0f30a4591b8 100644
--- a/security/landlock/limits.h
+++ b/security/landlock/limits.h
@@ -23,7 +23,7 @@
 #define LANDLOCK_MASK_ACCESS_FS		((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
 #define LANDLOCK_NUM_ACCESS_FS		__const_hweight64(LANDLOCK_MASK_ACCESS_FS)
 
-#define LANDLOCK_LAST_ACCESS_NET	LANDLOCK_ACCESS_NET_CONNECT_TCP
+#define LANDLOCK_LAST_ACCESS_NET	LANDLOCK_ACCESS_NET_BIND_UDP
 #define LANDLOCK_MASK_ACCESS_NET	((LANDLOCK_LAST_ACCESS_NET << 1) - 1)
 #define LANDLOCK_NUM_ACCESS_NET		__const_hweight64(LANDLOCK_MASK_ACCESS_NET)
 
diff --git a/security/landlock/net.c b/security/landlock/net.c
index c368649985c5..f9ccb52e7d45 100644
--- a/security/landlock/net.c
+++ b/security/landlock/net.c
@@ -81,15 +81,17 @@ static int current_check_access_socket(struct socket *const sock,
 			 * inconsistencies and return -EINVAL if needed.
 			 */
 			return 0;
-		} else if (access_request == LANDLOCK_ACCESS_NET_BIND_TCP) {
+		} else if (access_request == LANDLOCK_ACCESS_NET_BIND_TCP ||
+			   access_request == LANDLOCK_ACCESS_NET_BIND_UDP) {
 			/*
 			 * Binding to an AF_UNSPEC address is treated
 			 * differently by IPv4 and IPv6 sockets. The socket's
 			 * family may change under our feet due to
 			 * setsockopt(IPV6_ADDRFORM), but that's ok: we either
-			 * reject entirely or require
-			 * %LANDLOCK_ACCESS_NET_BIND_TCP for the given port, so
-			 * it cannot be used to bypass the policy.
+			 * reject entirely for IPv6 or require
+			 * %LANDLOCK_ACCESS_NET_BIND_TCP or
+			 * %LANDLOCK_ACCESS_NET_BIND_UDP for IPv4,
+			 * so it cannot be used to bypass the policy.
 			 *
 			 * IPv4 sockets map AF_UNSPEC to AF_INET for
 			 * retrocompatibility for bind accesses, only if the
@@ -135,7 +137,8 @@ static int current_check_access_socket(struct socket *const sock,
 		if (access_request == LANDLOCK_ACCESS_NET_CONNECT_TCP) {
 			audit_net.dport = port;
 			audit_net.v4info.daddr = addr4->sin_addr.s_addr;
-		} else if (access_request == LANDLOCK_ACCESS_NET_BIND_TCP) {
+		} else if (access_request == LANDLOCK_ACCESS_NET_BIND_TCP ||
+			   access_request == LANDLOCK_ACCESS_NET_BIND_UDP) {
 			audit_net.sport = port;
 			audit_net.v4info.saddr = addr4->sin_addr.s_addr;
 		} else {
@@ -157,7 +160,8 @@ static int current_check_access_socket(struct socket *const sock,
 		if (access_request == LANDLOCK_ACCESS_NET_CONNECT_TCP) {
 			audit_net.dport = port;
 			audit_net.v6info.daddr = addr6->sin6_addr;
-		} else if (access_request == LANDLOCK_ACCESS_NET_BIND_TCP) {
+		} else if (access_request == LANDLOCK_ACCESS_NET_BIND_TCP ||
+			   access_request == LANDLOCK_ACCESS_NET_BIND_UDP) {
 			audit_net.sport = port;
 			audit_net.v6info.saddr = addr6->sin6_addr;
 		} else {
@@ -216,6 +220,8 @@ static int hook_socket_bind(struct socket *const sock,
 
 	if (sk_is_tcp(sock->sk))
 		access_request = LANDLOCK_ACCESS_NET_BIND_TCP;
+	else if (sk_is_udp(sock->sk))
+		access_request = LANDLOCK_ACCESS_NET_BIND_UDP;
 	else
 		return 0;
 
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index accfd2e5a0cd..d45469d5d464 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -166,7 +166,7 @@ static const struct file_operations ruleset_fops = {
  * If the change involves a fix that requires userspace awareness, also update
  * the errata documentation in Documentation/userspace-api/landlock.rst .
  */
-const int landlock_abi_version = 9;
+const int landlock_abi_version = 10;
 
 /**
  * sys_landlock_create_ruleset - Create a new ruleset
diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c
index 30d37234086c..6c8113c2ded1 100644
--- a/tools/testing/selftests/landlock/base_test.c
+++ b/tools/testing/selftests/landlock/base_test.c
@@ -76,8 +76,8 @@ TEST(abi_version)
 	const struct landlock_ruleset_attr ruleset_attr = {
 		.handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE,
 	};
-	ASSERT_EQ(9, landlock_create_ruleset(NULL, 0,
-					     LANDLOCK_CREATE_RULESET_VERSION));
+	ASSERT_EQ(10, landlock_create_ruleset(NULL, 0,
+					      LANDLOCK_CREATE_RULESET_VERSION));
 
 	ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0,
 					      LANDLOCK_CREATE_RULESET_VERSION));
diff --git a/tools/testing/selftests/landlock/net_test.c b/tools/testing/selftests/landlock/net_test.c
index 4c528154ea92..ec392d971ea3 100644
--- a/tools/testing/selftests/landlock/net_test.c
+++ b/tools/testing/selftests/landlock/net_test.c
@@ -1326,11 +1326,12 @@ FIXTURE_TEARDOWN(mini)
 
 /* clang-format off */
 
-#define ACCESS_LAST LANDLOCK_ACCESS_NET_CONNECT_TCP
+#define ACCESS_LAST LANDLOCK_ACCESS_NET_BIND_UDP
 
 #define ACCESS_ALL ( \
 	LANDLOCK_ACCESS_NET_BIND_TCP | \
-	LANDLOCK_ACCESS_NET_CONNECT_TCP)
+	LANDLOCK_ACCESS_NET_CONNECT_TCP | \
+	LANDLOCK_ACCESS_NET_BIND_UDP)
 
 /* clang-format on */
 
-- 
2.39.5



* [PATCH 1/3] apparmor: Fix return in ns_mkdir_op
From: Hongling Zeng @ 2026-05-03  4:12 UTC (permalink / raw)
  To: john.johansen, paul, jmorris, serge, neil, brauner, jlayton, jack
  Cc: apparmor, linux-security-module, linux-kernel, zhongling0719,
	Hongling Zeng

Return NULL instead of passing zero to ERR_PTR() when error is zero.
This fixes the following smatch warning:
    - security/apparmor/apparmorfs.c:1846 ns_mkdir_op() warn:
      passing zero to 'ERR_PTR'
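
For context, ERR_PTR() is a plain cast, so ERR_PTR(0) already evaluates
to NULL and the change makes the intent explicit rather than altering
behaviour. The userspace sketch below re-creates the idea with
hypothetical sketch_* names (they are not the kernel macros themselves):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Error pointers occupy the last 4095 values of the address space. */
#define SKETCH_MAX_ERRNO 4095

/* Userspace stand-in for ERR_PTR(): encodes an errno as a pointer. */
static inline void *sketch_err_ptr(long error)
{
	return (void *)error;
}

/* Userspace stand-in for IS_ERR(): true only for encoded errors. */
static inline bool sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Shape of the fixed return statement: NULL on success, error otherwise. */
static void *mkdir_result(int error)
{
	return error ? sketch_err_ptr(error) : NULL;
}
```

A caller testing the result with an IS_ERR()-style check sees success
both before and after the change; only the static checker's view of the
code differs.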

Fixes: 88d5baf69082 ("Change inode_operations.mkdir to return struct dentry *")
Signed-off-by: Hongling Zeng <zenghongling@kylinos.cn>
---
 security/apparmor/apparmorfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
index ededaf46f3ca..1d7b1c70f22a 100644
--- a/security/apparmor/apparmorfs.c
+++ b/security/apparmor/apparmorfs.c
@@ -1922,7 +1922,7 @@ static struct dentry *ns_mkdir_op(struct mnt_idmap *idmap, struct inode *dir,
 	mutex_unlock(&parent->lock);
 	aa_put_ns(parent);
 
-	return ERR_PTR(error);
+	return error ? ERR_PTR(error) : NULL;
 }
 
 static int ns_rmdir_op(struct inode *dir, struct dentry *dentry)
-- 
2.25.1



* Re: [PATCH] ima: debugging late_initcall_sync measurements
From: Mimi Zohar @ 2026-05-03 11:36 UTC (permalink / raw)
  To: David Safford
  Cc: Yeoreum Yun, Jonathan McDowell, linux-security-module,
	linux-kernel, linux-integrity, linux-arm-kernel, kvmarm, paul,
	jmorris, serge, roberto.sassu, dmitry.kasatkin, eric.snowberg,
	jarkko, jgg, sudeep.holla, maz, oupton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, noodles,
	sebastianene
In-Reply-To: <CAGWfHUW+AX0Hpuw5Vr5iTSaJKQJ+O_4nWWmU1UR8Z_3XFctHZg@mail.gmail.com>

On Fri, 2026-05-01 at 12:52 -0400, David Safford wrote:
> On Thu, Apr 30, 2026 at 5:43 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > 
> > On Thu, 2026-04-30 at 10:48 +0100, Yeoreum Yun wrote:
> > > With above change I confirmed there is no measurement log
> > > between boot_aggregate and boot_aggregate_late except "kernel_version"
> > > But this is ignorable since this UTS measurement is done in
> > > "ima_init_core() (old: ima_init())" and it is part of ima initialisation.
> > > 
> > > 1. ima_policy=tcb
> > > 
> > >   # cat /sys/kernel/security/ima/ascii_runtime_measurements
> > >   10 0adefe762c149c7cec19da62f0da1297fcfbffff ima-ng sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate
> > >   10 4e5d73ebadfd8f850cb93ce4de755ba148a9a7d5 ima-ng sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate_late
> > >   10 7c23cc970eceec906f7a41bc2fbde770d7092209 ima-ng sha256:72ade6ae3d35cfe5ede7a77b1c0ed1d1782a899445fdcb219c0e994a084a70d5 /bin/busybox
> snip
> > > 
> > > 2. ima_policy=critical_data
> > > 
> > >   # cat /sys/kernel/security/ima/ascii_runtime_measurements
> > >   10 0adefe762c149c7cec19da62f0da1297fcfbffff ima-ng sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate
> > >   10 49ab61dd97ea2f759edcb6c6a3387ac67f0aa576 ima-buf sha256:0c907aab3261194f16b0c2a422a82f145bc9b9ecb8fdb633fa43e3e5379f0af2 kernel_version 372e312e302d7263312b // Ignorable since it's generated by ima_init(_core)().
> > >   10 4e5d73ebadfd8f850cb93ce4de755ba148a9a7d5 ima-ng sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate_late
> > > 
> > > Therefore, init_ima() could move into late_initcall_sync like v1 did:
> > >   - https://lore.kernel.org/all/20260417175759.3191279-2-yeoreum.yun@arm.com/
> > 
> > Thanks, Yeoreum.  It's a bit premature to claim it's "safe" to move the
> > initcall.  Hopefully others will respond.
> > 
> > Mimi
> 
> I have also run with this patch on a number of bare metal and virtual machines,
> running everything from default Fedora 44 to a version with everything turned on
> (uefi secure boot, UKI with sdboot stub measurements, IMA measurement
> and appraisal enabled,
> all systemd measurements on, and systemd using the TPM for root
> partition decryption.)
> I too see only the kernel_version event between the normal and late
> calls, if ima_policy=critical_data.

Thanks, Dave!  Were all the systems you tested x86_64?  The next step would be
to test on different arch's (e.g. Z, Power).

Mimi


* Re: [PATCH] ima: debugging late_initcall_sync measurements
From: Mimi Zohar @ 2026-05-03 12:42 UTC (permalink / raw)
  To: David Safford
  Cc: Yeoreum Yun, Jonathan McDowell, linux-security-module,
	linux-kernel, linux-integrity, linux-arm-kernel, kvmarm, paul,
	jmorris, serge, roberto.sassu, dmitry.kasatkin, eric.snowberg,
	jarkko, jgg, sudeep.holla, maz, oupton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, noodles,
	sebastianene
In-Reply-To: <202f90682fe47bb5fb9b08f8678ae00981b5290b.camel@linux.ibm.com>

On Sun, 2026-05-03 at 07:36 -0400, Mimi Zohar wrote:
> On Fri, 2026-05-01 at 12:52 -0400, David Safford wrote:
> > On Thu, Apr 30, 2026 at 5:43 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > 
> > > On Thu, 2026-04-30 at 10:48 +0100, Yeoreum Yun wrote:
> > > > With above change I confirmed there is no meaurement log
> > > > between boot_aggregate and boot_aggregate_late except "kernel_version"
> > > > But this is ignorable since this UTS measurement is done in
> > > > "ima_init_core() (old: ima_init())" and it is part of ima initialisation.
> > > > 
> > > > 1. ima_policy=tcb
> > > > 
> > > >   # cat /sys/kernel/security/ima/ascii_runtime_measurements
> > > >   10 0adefe762c149c7cec19da62f0da1297fcfbffff ima-ng sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate
> > > >   10 4e5d73ebadfd8f850cb93ce4de755ba148a9a7d5 ima-ng sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate_late
> > > >   10 7c23cc970eceec906f7a41bc2fbde770d7092209 ima-ng sha256:72ade6ae3d35cfe5ede7a77b1c0ed1d1782a899445fdcb219c0e994a084a70d5 /bin/busybox
> > snip
> > > > 
> > > > 2. ima_policy=critical_data
> > > > 
> > > >   # cat /sys/kernel/security/ima/ascii_runtime_measurements
> > > >   10 0adefe762c149c7cec19da62f0da1297fcfbffff ima-ng sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate
> > > >   10 49ab61dd97ea2f759edcb6c6a3387ac67f0aa576 ima-buf sha256:0c907aab3261194f16b0c2a422a82f145bc9b9ecb8fdb633fa43e3e5379f0af2 kernel_version 372e312e302d7263312b // Ignorable since it's generated by ima_init(_core)().
> > > >   10 4e5d73ebadfd8f850cb93ce4de755ba148a9a7d5 ima-ng sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate_late
> > > > 
> > > > Therefore, init_ima() could move into late_initcall_sync like v1 did:
> > > >   - https://lore.kernel.org/all/20260417175759.3191279-2-yeoreum.yun@arm.com/
> > > 
> > > Thanks, Yeoreum.  It's a bit premature to claim it's "safe" to move the
> > > initcall.  Hopefully others will respond.
> > > 
> > > Mimi
> > 
> > I have also run with this patch on a number of bare metal and virtual machines,
> > running everything from default Fedora 44 to a version with everything turned on
> > (uefi secure boot, UKI with sdboot stub measurements, IMA measurement
> > and appraisal enabled,
> > all systemd measurements on, and systemd using the TPM for root
> > partition decryption.)
> > I too see only the kernel_version event between the normal and late
> > calls, if ima_policy=critical_data.
> 
> Thanks, Dave!  Were all the systems you tested x86_64?  The next step would be
> to test on different arch's (e.g. Z, Power).

On both Z and PowerVM, there are ~30 measurements between boot_aggregate and
boot_aggregate_late.  For example, on PowerVM:

# grep -n boot_aggregate /sys/kernel/security/integrity/ima/ascii_runtime_measurements

1:10 f60a05d7354fb34aabc02965216abd3428ea52bb ima-sig sha256:9887dd089ee19a6517bca10580b02c1bb9aa6cd86c157b6ead8a1c0403f348d5 boot_aggregate
31:10 e2592b0d61da6300d3db447b143897a9792231ea ima-sig sha256:9887dd089ee19a6517bca10580b02c1bb9aa6cd86c157b6ead8a1c0403f348d5 boot_aggregate_late

It would be interesting to see the results from a Raspberry Pi 5 as well,
with/without a TPM.

Mimi

^ permalink raw reply

* Re: [PATCH] ima: debugging late_initcall_sync measurements
From: Paul Moore @ 2026-05-03 16:46 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Yeoreum Yun, Jonathan McDowell, linux-security-module,
	linux-kernel, linux-integrity, linux-arm-kernel, kvmarm, jmorris,
	serge, roberto.sassu, dmitry.kasatkin, eric.snowberg, jarkko, jgg,
	sudeep.holla, maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will, noodles, sebastianene
In-Reply-To: <ba4bf28314b679474a6a8da6298e548e54b3754c.camel@linux.ibm.com>

On Thu, Apr 30, 2026 at 9:51 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> On Thu, 2026-04-30 at 18:35 -0400, Paul Moore wrote:
> > On Thu, Apr 30, 2026 at 5:39 PM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > > On Thu, 2026-04-30 at 10:48 +0100, Yeoreum Yun wrote:
> > > > With above change I confirmed there is no meaurement log
> > > > between boot_aggregate and boot_aggregate_late except "kernel_version"
> > > > But this is ignorable since this UTS measurement is done in
> > > > "ima_init_core() (old: ima_init())" and it is part of ima initialisation.
> > > >
> > > > 1. ima_policy=tcb
> > > >
> > > >   # cat /sys/kernel/security/ima/ascii_runtime_measurements
> > > >   10 0adefe762c149c7cec19da62f0da1297fcfbffff ima-ng sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate
> > > >   10 4e5d73ebadfd8f850cb93ce4de755ba148a9a7d5 ima-ng sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate_late
> > > >   10 7c23cc970eceec906f7a41bc2fbde770d7092209 ima-ng sha256:72ade6ae3d35cfe5ede7a77b1c0ed1d1782a899445fdcb219c0e994a084a70d5 /bin/busybox
> > > >   10 17ec669c65c401e5e85875cf2962eb7d8c47595f ima-ng sha256:dc6b013e9768d9b13bcd6678470448090138ca831f4771a43ce3988d8e54ffce /lib/ld-linux-aarch64.so.1
> > > >   10 58679a66ac1de17f02595625a8fbeafa259a4c81 ima-ng sha256:494f62bcfb2fcf1b427d5092fafa62c8df39a83b4a64402620b28846724f237f /usr/lib/libtirpc.so.3.0.0
> > > >   10 42f74ee200434576e33be153830b3d55bbe6d2bf ima-ng sha256:a18856b4f6927bc2b8dd4608c0768b8f98544a161b85bf4a64419131243ad300 /lib/libresolv.so.2
> > > >   10 626b4f7bd4f123d18d3a3d8719ed0ae19ee5f331 ima-ng sha256:b8d442de5d31c3f9d1bbb98785f04d4a23dc53442b286d85d4b355927cbe9af4 /lib/libc.so.6
> > > >   10 655a200869696207646377a58cab417fd35b09d2 ima-ng sha256:ad46146b6dd32b47213e5327f1bb2f962ef838a4b707ef7445fa2dbc9019b44f /etc/inittab
> > > >   10 81353202685e022fcd0069a3b2fc4eaa6b1db537 ima-ng sha256:74d698fe0a6862050af29083aa591c960ec1f67be960047e96bb6be5fc2bc0c0 /bin/mount
> > > >   10 ae64184ee607ef8f3aa08ab52cb548318534fd4b ima-ng sha256:27846b57e8234c6a9611b00351f581a54ad6f9a1920b9aa18ceb0ae28e4f7564 /lib/libmount.so.1.1.0
> > > >   10 5ea01f34e7705d1bdb936fd576e2aeb5fd78dab9 ima-ng sha256:3d2a414ec0355fcf0910224fb4a3c53e13d98731a35241edfdf4fb911ed9b210 /lib/libblkid.so.1.1.0
> > > >   10 22c48b4853594a08a73ad4ae6dbe6f2c2bebc6c5 ima-ng sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 /run/utmp
> > > >   10 3024ea5021f8a5d9fb4bd519d599bdca43b7fb93 ima-ng sha256:71ea9ffe2b30e5a9bdceff78785cf281cc41544474db8dc4605a06a597ce1edc /etc/fstab
> > > >   10 2e7530a0f56420991ac7611734cea4774b92b9ef ima-ng sha256:df4697d699442cfe73db7cc8b4c1b37e8a31e75e01f66a0d70134ac812fa683b /bin/mkdir
> > > >   10 3ad117a863aa1ed7b7c09e1d106f84abf7d2ae96 ima-ng sha256:c19a710989b43222431b02399273dba409fe10ca8eefff88eaa936fa695f8324 /bin/ln
> > > >   10 4141c82cb516ac3c846e0b08abcd6abeee7efa1a ima-ng sha256:b75d7f28772f71715a941c77e07e3922815391dd9cc5718ad21f2231c2da09bb /etc/hostname
> > > >   10 dfcedd3c7dc3ed42e09219804504489ab264e2e3 ima-ng sha256:dc1615df9f2012b20b81ffad8e07e16293039ba7fd897854ca3646d6cfea0c0f /etc/init.d/rcS
> > > >   ...
> > > >
> > > > 2. ima_policy=critical_data
> > > >
> > > >   # cat /sys/kernel/security/ima/ascii_runtime_measurements
> > > >   10 0adefe762c149c7cec19da62f0da1297fcfbffff ima-ng sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate
> > > >   10 49ab61dd97ea2f759edcb6c6a3387ac67f0aa576 ima-buf sha256:0c907aab3261194f16b0c2a422a82f145bc9b9ecb8fdb633fa43e3e5379f0af2 kernel_version 372e312e302d7263312b // Ignorable since it's generated by ima_init(_core)().
> > > >   10 4e5d73ebadfd8f850cb93ce4de755ba148a9a7d5 ima-ng sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate_late
> > > >
> > > > Therefore, init_ima() could move into late_initcall_sync like v1 did:
> > > >   - https://lore.kernel.org/all/20260417175759.3191279-2-yeoreum.yun@arm.com/
> > >
> > > Thanks, Yeoreum.  It's a bit premature to claim it's "safe" to move the
> > > initcall.  Hopefully others will respond.
> >
> > Is it not possible to look at the code and determine if it is safe or
> > not?  Or is the initialization of TPM devices at boot done in a random
> > order with respect to the initcall levels?
>
> The TPM is normally initialized at the device_initcall, except when other
> resources are not ready.
>
> (Abbreviated) AI explanation:
>    If the TPM's first probe succeeds at device_initcall with no deferral, IMA
>    finds it fine. It is only when the TPM is pushed onto the deferred list that
>    late_initcall can execute before the retry succeeds, leaving
>    tpm_default_chip() returning NULL.

I really hope you are using AI only to phrase a response and not as a
substitute for actually investigating the code and determining what is
happening.

Regardless, assuming you always want IMA to leverage TPMs when they
exist, your reply suggests that using an initcall based IMA init
scheme, even a late-sync initcall, may not be sufficient because
deferred TPM initialization could happen later, yes?

-- 
paul-moore.com

^ permalink raw reply

* [PATCH v2 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: David Windsor @ 2026-05-03 21:18 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman,
	Kumar Kartikeya Dwivedi, KP Singh, Matt Bobrowski, Paul Moore,
	James Morris, Serge E. Hallyn, Mimi Zohar, Roberto Sassu,
	Dmitry Kasatkin, Stephen Smalley, Casey Schaufler
  Cc: Song Liu, Jan Kara, John Fastabend, Martin KaFai Lau,
	Yonghong Song, Jiri Olsa, Eric Snowberg, Ondrej Mosnacek,
	linux-fsdevel, linux-kernel, bpf, linux-security-module,
	linux-integrity, selinux
In-Reply-To: <20260503211835.16103-1-dwindsor@gmail.com>

Add bpf_init_inode_xattr() kfunc for BPF LSM programs to atomically set
xattrs via the inode_init_security hook using lsm_get_xattr_slot().

The inode_init_security hook previously took the xattr array and count
as two separate output parameters (struct xattr *xattrs, int
*xattr_count), which BPF programs cannot write to. Pass the xattr state
as a single context object (struct lsm_xattr_ctx) instead, and have
bpf_init_inode_xattr() take that context directly. Update the existing
in-tree callers of inode_init_security to take and forward the new
lsm_xattr_ctx.

Because we rely on the hook-specific ctx layout, the kfunc is
restricted to lsm/inode_init_security. Restrict the xattr names that
may be set via this kfunc to the bpf.* namespace.
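
A minimal userspace sketch of the slot accounting described above,
assuming the bpf.* prefix is the literal string "bpf." and reusing the
one-xattr limit from this patch. The model_* names are hypothetical and
not part of the kernel; the dynptr handling and the value/name copy are
left out.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct xattr {
	const char *name;
	void *value;
	size_t value_len;
};

struct lsm_xattr_ctx {
	struct xattr *xattrs;
	int *xattr_count;
};

#define MODEL_BPF_PREFIX "bpf."	/* assumed value of XATTR_BPF_LSM_SUFFIX */
#define MODEL_BPF_MAX_XATTRS 1	/* mirrors BPF_LSM_INODE_INIT_XATTRS */

/* Same logic as the patched lsm_get_xattr_slot(), minus unlikely(). */
static struct xattr *model_get_xattr_slot(struct lsm_xattr_ctx *ctx)
{
	if (!ctx || !ctx->xattrs || !ctx->xattr_count)
		return NULL;
	return &ctx->xattrs[(*ctx->xattr_count)++];
}

/* Count the slots that already hold a bpf.* name. */
static int model_bpf_xattrs_used(const struct lsm_xattr_ctx *ctx)
{
	const size_t prefix_len = sizeof(MODEL_BPF_PREFIX) - 1;
	int i, n = 0;

	for (i = 0; i < *ctx->xattr_count; i++) {
		const char *name = ctx->xattrs[i].name;

		if (name && !strncmp(name, MODEL_BPF_PREFIX, prefix_len))
			n++;
	}
	return n;
}

/*
 * Hypothetical model of the checks: validate, then reserve a slot.
 * The caller must provide an array with room for one more entry.
 */
static int model_init_inode_xattr(struct lsm_xattr_ctx *ctx, const char *name,
				  void *value, size_t value_len)
{
	struct xattr *slot;

	if (!ctx || !name || !ctx->xattrs || !ctx->xattr_count)
		return -EINVAL;
	if (model_bpf_xattrs_used(ctx) >= MODEL_BPF_MAX_XATTRS)
		return -ENOSPC;
	if (name[0] == '\0' || value_len == 0)
		return -EINVAL;
	if (strncmp(name, MODEL_BPF_PREFIX, sizeof(MODEL_BPF_PREFIX) - 1))
		return -EPERM;

	slot = model_get_xattr_slot(ctx);
	if (!slot)
		return -ENOSPC;

	slot->name = name;
	slot->value = value;
	slot->value_len = value_len;
	return 0;
}
```

With a four-slot array and a zeroed count, a "security.foo" name is
refused with -EPERM, a first "bpf.file_label" call takes slot 0, and a
second bpf.* call returns -ENOSPC because the per-inode limit is 1.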

Suggested-by: Song Liu <song@kernel.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
---
 fs/bpf_fs_kfuncs.c                | 106 +++++++++++++++++++++++++++++-
 include/linux/bpf_lsm.h           |   3 +
 include/linux/evm.h               |   9 +--
 include/linux/lsm_hook_defs.h     |   4 +-
 include/linux/lsm_hooks.h         |  16 ++---
 include/linux/security.h          |   5 ++
 kernel/bpf/bpf_lsm.c              |   1 +
 security/bpf/hooks.c              |   1 +
 security/integrity/evm/evm_main.c |   8 ++-
 security/security.c               |   7 +-
 security/selinux/hooks.c          |   4 +-
 security/smack/smack_lsm.c        |  13 ++--
 12 files changed, 147 insertions(+), 30 deletions(-)

diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c
index 9d27be058494..193accc00796 100644
--- a/fs/bpf_fs_kfuncs.c
+++ b/fs/bpf_fs_kfuncs.c
@@ -10,6 +10,7 @@
 #include <linux/fsnotify.h>
 #include <linux/file.h>
 #include <linux/kernfs.h>
+#include <linux/lsm_hooks.h>
 #include <linux/mm.h>
 #include <linux/xattr.h>
 
@@ -353,6 +354,97 @@ __bpf_kfunc int bpf_cgroup_read_xattr(struct cgroup *cgroup, const char *name__s
 }
 #endif /* CONFIG_CGROUPS */
 
+static int bpf_xattrs_used(const struct lsm_xattr_ctx *ctx)
+{
+	const size_t prefix_len = sizeof(XATTR_BPF_LSM_SUFFIX) - 1;
+	int i, n = 0;
+
+	for (i = 0; i < *ctx->xattr_count; i++) {
+		const char *name = ctx->xattrs[i].name;
+
+		if (name && !strncmp(name, XATTR_BPF_LSM_SUFFIX, prefix_len))
+			n++;
+	}
+	return n;
+}
+
+static int __bpf_init_inode_xattr(struct lsm_xattr_ctx *xattr_ctx,
+				  const char *name__str,
+				  const struct bpf_dynptr *value_p)
+{
+	struct bpf_dynptr_kern *value_ptr = (struct bpf_dynptr_kern *)value_p;
+	size_t name_len;
+	void *xattr_value;
+	struct xattr *xattr;
+	struct xattr *xattrs;
+	int *xattr_count;
+	const void *value;
+	u32 value_len;
+
+	if (!xattr_ctx || !name__str)
+		return -EINVAL;
+
+	xattrs = xattr_ctx->xattrs;
+	xattr_count = xattr_ctx->xattr_count;
+	if (!xattrs || !xattr_count)
+		return -EINVAL;
+	if (bpf_xattrs_used(xattr_ctx) >= BPF_LSM_INODE_INIT_XATTRS)
+		return -ENOSPC;
+
+	name_len = strlen(name__str);
+	if (name_len == 0 || name_len > XATTR_NAME_MAX)
+		return -EINVAL;
+	if (strncmp(name__str, XATTR_BPF_LSM_SUFFIX,
+		    sizeof(XATTR_BPF_LSM_SUFFIX) - 1))
+		return -EPERM;
+
+	value_len = __bpf_dynptr_size(value_ptr);
+	if (value_len == 0 || value_len > XATTR_SIZE_MAX)
+		return -EINVAL;
+
+	value = __bpf_dynptr_data(value_ptr, value_len);
+	if (!value)
+		return -EINVAL;
+
+	/* Combine xattr value + name into one allocation. */
+	xattr_value = kmalloc(value_len + name_len + 1, GFP_KERNEL);
+	if (!xattr_value)
+		return -ENOMEM;
+
+	memcpy(xattr_value, value, value_len);
+	memcpy(xattr_value + value_len, name__str, name_len);
+	((char *)xattr_value)[value_len + name_len] = '\0';
+
+	xattr = lsm_get_xattr_slot(xattr_ctx);
+	if (!xattr) {
+		kfree(xattr_value);
+		return -ENOSPC;
+	}
+
+	xattr->value = xattr_value;
+	xattr->name = (const char *)xattr_value + value_len;
+	xattr->value_len = value_len;
+
+	return 0;
+}
+
+/**
+ * bpf_init_inode_xattr - set an xattr on a new inode from inode_init_security
+ * @xattr_ctx: inode_init_security xattr state from the hook context
+ * @name__str: xattr name (e.g., "bpf.file_label")
+ * @value_p: dynptr containing the xattr value
+ *
+ * Only callable from lsm/inode_init_security programs.
+ *
+ * Return: 0 on success, negative error on failure.
+ */
+__bpf_kfunc int bpf_init_inode_xattr(struct lsm_xattr_ctx *xattr_ctx,
+				     const char *name__str,
+				     const struct bpf_dynptr *value_p)
+{
+	return __bpf_init_inode_xattr(xattr_ctx, name__str, value_p);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(bpf_fs_kfunc_set_ids)
@@ -363,13 +455,25 @@ BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_set_dentry_xattr, KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_remove_dentry_xattr, KF_SLEEPABLE)
+BTF_ID_FLAGS(func, bpf_init_inode_xattr, KF_SLEEPABLE)
 BTF_KFUNCS_END(bpf_fs_kfunc_set_ids)
 
+BTF_ID_LIST(bpf_lsm_inode_init_security_btf_ids)
+BTF_ID(func, bpf_lsm_inode_init_security)
+
+BTF_ID_LIST(bpf_init_inode_xattr_btf_ids)
+BTF_ID(func, bpf_init_inode_xattr)
+
 static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
 {
 	if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) ||
-	    prog->type == BPF_PROG_TYPE_LSM)
+	    prog->type == BPF_PROG_TYPE_LSM) {
+		/* bpf_init_inode_xattr only attaches to inode_init_security. */
+		if (kfunc_id == bpf_init_inode_xattr_btf_ids[0] &&
+		    prog->aux->attach_btf_id != bpf_lsm_inode_init_security_btf_ids[0])
+			return -EACCES;
 		return 0;
+	}
 	return -EACCES;
 }
 
diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
index 643809cc78c3..b97a3d79529d 100644
--- a/include/linux/bpf_lsm.h
+++ b/include/linux/bpf_lsm.h
@@ -19,6 +19,9 @@
 #include <linux/lsm_hook_defs.h>
 #undef LSM_HOOK
 
+/* max bpf xattrs per inode */
+#define BPF_LSM_INODE_INIT_XATTRS 1
+
 struct bpf_storage_blob {
 	struct bpf_local_storage __rcu *storage;
 };
diff --git a/include/linux/evm.h b/include/linux/evm.h
index 913f4573b203..dff930bc10ba 100644
--- a/include/linux/evm.h
+++ b/include/linux/evm.h
@@ -12,6 +12,8 @@
 #include <linux/integrity.h>
 #include <linux/xattr.h>
 
+struct lsm_xattr_ctx;
+
 #ifdef CONFIG_EVM
 extern int evm_set_key(void *key, size_t keylen);
 extern enum integrity_status evm_verifyxattr(struct dentry *dentry,
@@ -21,8 +23,8 @@ extern enum integrity_status evm_verifyxattr(struct dentry *dentry,
 int evm_fix_hmac(struct dentry *dentry, const char *xattr_name,
 		 const char *xattr_value, size_t xattr_value_len);
 int evm_inode_init_security(struct inode *inode, struct inode *dir,
-			    const struct qstr *qstr, struct xattr *xattrs,
-			    int *xattr_count);
+			    const struct qstr *qstr,
+			    struct lsm_xattr_ctx *xattr_ctx);
 extern bool evm_revalidate_status(const char *xattr_name);
 extern int evm_protected_xattr_if_enabled(const char *req_xattr_name);
 extern int evm_read_protected_xattrs(struct dentry *dentry, u8 *buffer,
@@ -63,8 +65,7 @@ static inline int evm_fix_hmac(struct dentry *dentry, const char *xattr_name,
 
 static inline int evm_inode_init_security(struct inode *inode, struct inode *dir,
 					  const struct qstr *qstr,
-					  struct xattr *xattrs,
-					  int *xattr_count)
+					  struct lsm_xattr_ctx *xattr_ctx)
 {
 	return 0;
 }
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 2b8dfb35caed..0df364ebb0a5 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -116,8 +116,8 @@ LSM_HOOK(int, 0, inode_alloc_security, struct inode *inode)
 LSM_HOOK(void, LSM_RET_VOID, inode_free_security, struct inode *inode)
 LSM_HOOK(void, LSM_RET_VOID, inode_free_security_rcu, void *inode_security)
 LSM_HOOK(int, -EOPNOTSUPP, inode_init_security, struct inode *inode,
-	 struct inode *dir, const struct qstr *qstr, struct xattr *xattrs,
-	 int *xattr_count)
+	 struct inode *dir, const struct qstr *qstr,
+	 struct lsm_xattr_ctx *xattr_ctx)
 LSM_HOOK(int, 0, inode_init_security_anon, struct inode *inode,
 	 const struct qstr *name, const struct inode *context_inode)
 LSM_HOOK(int, 0, inode_create, struct inode *dir, struct dentry *dentry,
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index b4f8cad53ddb..2133b729e87d 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -200,20 +200,18 @@ extern struct lsm_static_calls_table static_calls_table __ro_after_init;
 
 /**
  * lsm_get_xattr_slot - Return the next available slot and increment the index
- * @xattrs: array storing LSM-provided xattrs
- * @xattr_count: number of already stored xattrs (updated)
+ * @ctx: xattr state shared by inode_init_security hooks
  *
- * Retrieve the first available slot in the @xattrs array to fill with an xattr,
- * and increment @xattr_count.
+ * Retrieve the first available slot in the @ctx->xattrs array to fill with an
+ * xattr, and increment @ctx->xattr_count.
  *
- * Return: The slot to fill in @xattrs if non-NULL, NULL otherwise.
+ * Return: The slot to fill in @ctx->xattrs if non-NULL, NULL otherwise.
  */
-static inline struct xattr *lsm_get_xattr_slot(struct xattr *xattrs,
-					       int *xattr_count)
+static inline struct xattr *lsm_get_xattr_slot(struct lsm_xattr_ctx *ctx)
 {
-	if (unlikely(!xattrs))
+	if (unlikely(!ctx || !ctx->xattrs || !ctx->xattr_count))
 		return NULL;
-	return &xattrs[(*xattr_count)++];
+	return &ctx->xattrs[(*ctx->xattr_count)++];
 }
 
 #endif /* ! __LINUX_LSM_HOOKS_H */
diff --git a/include/linux/security.h b/include/linux/security.h
index 41d7367cf403..a2fc72e63ada 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -68,6 +68,11 @@ struct watch;
 struct watch_notification;
 struct lsm_ctx;
 
+struct lsm_xattr_ctx {
+	struct xattr *xattrs;
+	int *xattr_count;
+};
+
 /* Default (no) options for the capable function */
 #define CAP_OPT_NONE 0x0
 /* If capable should audit the security request */
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index c5c925f00202..fbbb4e1c04fc 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -315,6 +315,7 @@ BTF_ID(func, bpf_lsm_inode_create)
 BTF_ID(func, bpf_lsm_inode_free_security)
 BTF_ID(func, bpf_lsm_inode_getattr)
 BTF_ID(func, bpf_lsm_inode_getxattr)
+BTF_ID(func, bpf_lsm_inode_init_security)
 BTF_ID(func, bpf_lsm_inode_mknod)
 BTF_ID(func, bpf_lsm_inode_need_killpriv)
 BTF_ID(func, bpf_lsm_inode_post_setxattr)
diff --git a/security/bpf/hooks.c b/security/bpf/hooks.c
index 40efde233f3a..d7c44c5c0e30 100644
--- a/security/bpf/hooks.c
+++ b/security/bpf/hooks.c
@@ -30,6 +30,7 @@ static int __init bpf_lsm_init(void)
 
 struct lsm_blob_sizes bpf_lsm_blob_sizes __ro_after_init = {
 	.lbs_inode = sizeof(struct bpf_storage_blob),
+	.lbs_xattr_count = BPF_LSM_INODE_INIT_XATTRS,
 };
 
 DEFINE_LSM(bpf) = {
diff --git a/security/integrity/evm/evm_main.c b/security/integrity/evm/evm_main.c
index b59e3f121b8a..c25301f25a0a 100644
--- a/security/integrity/evm/evm_main.c
+++ b/security/integrity/evm/evm_main.c
@@ -1062,14 +1062,16 @@ static int evm_inode_copy_up_xattr(struct dentry *src, const char *name)
  * evm_inode_init_security - initializes security.evm HMAC value
  */
 int evm_inode_init_security(struct inode *inode, struct inode *dir,
-			    const struct qstr *qstr, struct xattr *xattrs,
-			    int *xattr_count)
+			    const struct qstr *qstr,
+			    struct lsm_xattr_ctx *xattr_ctx)
 {
 	struct evm_xattr *xattr_data;
 	struct xattr *xattr, *evm_xattr;
+	struct xattr *xattrs;
 	bool evm_protected_xattrs = false;
 	int rc;
 
+	xattrs = xattr_ctx ? xattr_ctx->xattrs : NULL;
 	if (!(evm_initialized & EVM_INIT_HMAC) || !xattrs)
 		return 0;
 
@@ -1087,7 +1089,7 @@ int evm_inode_init_security(struct inode *inode, struct inode *dir,
 	if (!evm_protected_xattrs)
 		return 0;
 
-	evm_xattr = lsm_get_xattr_slot(xattrs, xattr_count);
+	evm_xattr = lsm_get_xattr_slot(xattr_ctx);
 	/*
 	 * Array terminator (xattr name = NULL) must be the first non-filled
 	 * xattr slot.
diff --git a/security/security.c b/security/security.c
index 4e999f023651..4cd43914ce93 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1334,6 +1334,7 @@ int security_inode_init_security(struct inode *inode, struct inode *dir,
 {
 	struct lsm_static_call *scall;
 	struct xattr *new_xattrs = NULL;
+	struct lsm_xattr_ctx xattr_ctx;
 	int ret = -EOPNOTSUPP, xattr_count = 0;
 
 	if (unlikely(IS_PRIVATE(inode)))
@@ -1349,10 +1350,12 @@ int security_inode_init_security(struct inode *inode, struct inode *dir,
 		if (!new_xattrs)
 			return -ENOMEM;
 	}
+	xattr_ctx.xattrs = new_xattrs;
+	xattr_ctx.xattr_count = &xattr_count;
 
 	lsm_for_each_hook(scall, inode_init_security) {
-		ret = scall->hl->hook.inode_init_security(inode, dir, qstr, new_xattrs,
-						  &xattr_count);
+		ret = scall->hl->hook.inode_init_security(inode, dir, qstr,
+							  &xattr_ctx);
 		if (ret && ret != -EOPNOTSUPP)
 			goto out;
 		/*
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 97801966bf32..dca81a22bf83 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -2962,11 +2962,11 @@ static int selinux_dentry_create_files_as(struct dentry *dentry, int mode,
 
 static int selinux_inode_init_security(struct inode *inode, struct inode *dir,
 				       const struct qstr *qstr,
-				       struct xattr *xattrs, int *xattr_count)
+				       struct lsm_xattr_ctx *xattr_ctx)
 {
 	const struct cred_security_struct *crsec = selinux_cred(current_cred());
 	struct superblock_security_struct *sbsec;
-	struct xattr *xattr = lsm_get_xattr_slot(xattrs, xattr_count);
+	struct xattr *xattr = lsm_get_xattr_slot(xattr_ctx);
 	u32 newsid, clen;
 	u16 newsclass;
 	int rc;
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 3f9ae05039a2..ea9549c666a1 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -981,10 +981,10 @@ smk_rule_transmutes(struct smack_known *subject,
 }
 
 static int
-xattr_dupval(struct xattr *xattrs, int *xattr_count,
+xattr_dupval(struct lsm_xattr_ctx *xattr_ctx,
 	     const char *name, const void *value, unsigned int vallen)
 {
-	struct xattr * const xattr = lsm_get_xattr_slot(xattrs, xattr_count);
+	struct xattr * const xattr = lsm_get_xattr_slot(xattr_ctx);
 
 	if (!xattr)
 		return 0;
@@ -1003,14 +1003,13 @@ xattr_dupval(struct xattr *xattrs, int *xattr_count,
  * @inode: the newly created inode
  * @dir: containing directory object
  * @qstr: unused
- * @xattrs: where to put the attributes
- * @xattr_count: current number of LSM-provided xattrs (updated)
+ * @xattr_ctx: where to put attributes and update count
  *
  * Returns 0 if it all works out, -ENOMEM if there's no memory
  */
 static int smack_inode_init_security(struct inode *inode, struct inode *dir,
 				     const struct qstr *qstr,
-				     struct xattr *xattrs, int *xattr_count)
+				     struct lsm_xattr_ctx *xattr_ctx)
 {
 	struct task_smack *tsp = smack_cred(current_cred());
 	struct inode_smack * const issp = smack_inode(inode);
@@ -1057,7 +1056,7 @@ static int smack_inode_init_security(struct inode *inode, struct inode *dir,
 		if (S_ISDIR(inode->i_mode)) {
 			transflag = SMK_INODE_TRANSMUTE;
 
-			if (xattr_dupval(xattrs, xattr_count,
+			if (xattr_dupval(xattr_ctx,
 				XATTR_SMACK_TRANSMUTE,
 				TRANS_TRUE,
 				TRANS_TRUE_SIZE
@@ -1067,7 +1066,7 @@ static int smack_inode_init_security(struct inode *inode, struct inode *dir,
 	}
 
 	if (rc == 0)
-		if (xattr_dupval(xattrs, xattr_count,
+		if (xattr_dupval(xattr_ctx,
 			    XATTR_SMACK_SUFFIX,
 			    issp->smk_inode->smk_known,
 		     strlen(issp->smk_inode->smk_known)
-- 
2.53.0


^ permalink raw reply related

* [PATCH RESEND] keys: use kmalloc_flex in user_preparse
From: Thorsten Blum @ 2026-05-04  9:31 UTC (permalink / raw)
  To: David Howells, Jarkko Sakkinen, Paul Moore, James Morris,
	Serge E. Hallyn
  Cc: linux-hardening, Thorsten Blum, keyrings, linux-security-module,
	linux-kernel

Use kmalloc_flex() when allocating a new struct user_key_payload in
user_preparse() to replace the open-coded size arithmetic and to keep
the size type-safe.
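
As a userspace analogy only, the open-coded "header plus trailing bytes"
arithmetic being replaced can be sketched with an explicit overflow
guard. The model_* names and the saturate-to-SIZE_MAX convention are
assumptions made for this sketch; it does not reproduce the kernel
helper.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for struct user_key_payload; the rcu_head member is omitted. */
struct model_payload {
	unsigned short datalen;
	char data[];		/* flexible array member */
};

/* Hypothetical helper: struct size plus n bytes, SIZE_MAX on overflow. */
static size_t model_flex_size(size_t n)
{
	const size_t base = sizeof(struct model_payload);

	if (n > SIZE_MAX - base)
		return SIZE_MAX;
	return base + n;
}

/* Allocate a payload and copy n bytes in, with the limits user_preparse() uses. */
static struct model_payload *model_payload_alloc(const void *src, size_t n)
{
	struct model_payload *p;
	size_t size = model_flex_size(n);

	if (n == 0 || n > 32767 || !src || size == SIZE_MAX)
		return NULL;

	p = malloc(size);
	if (!p)
		return NULL;

	p->datalen = (unsigned short)n;
	memcpy(p->data, src, n);
	return p;
}
```

The point of routing the size through one helper is that the element
count and the struct type cannot drift apart at the call site.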

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
---
 security/keys/user_defined.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/keys/user_defined.c b/security/keys/user_defined.c
index 686d56e4cc85..6f88b507f927 100644
--- a/security/keys/user_defined.c
+++ b/security/keys/user_defined.c
@@ -64,7 +64,7 @@ int user_preparse(struct key_preparsed_payload *prep)
 	if (datalen == 0 || datalen > 32767 || !prep->data)
 		return -EINVAL;
 
-	upayload = kmalloc(sizeof(*upayload) + datalen, GFP_KERNEL);
+	upayload = kmalloc_flex(*upayload, data, datalen);
 	if (!upayload)
 		return -ENOMEM;
 

^ permalink raw reply related

* Re: [PATCH] ima: debugging late_initcall_sync measurements
From: Mimi Zohar @ 2026-05-04 12:02 UTC (permalink / raw)
  To: Paul Moore
  Cc: Yeoreum Yun, Jonathan McDowell, linux-security-module,
	linux-kernel, linux-integrity, linux-arm-kernel, kvmarm, jmorris,
	serge, roberto.sassu, dmitry.kasatkin, eric.snowberg, jarkko, jgg,
	sudeep.holla, maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will, noodles, sebastianene
In-Reply-To: <CAHC9VhRE2kRr1fdDf6xgQgpSrtvqtP8Vy9LVGJhDZFUbzLKGmQ@mail.gmail.com>

On Sun, 2026-05-03 at 12:46 -0400, Paul Moore wrote:
> Regardless, assuming you always want IMA to leverage a TPMs when they
> exist, your reply suggests that using an initcall based IMA init
> scheme, even a late-sync initcall, may not be sufficient because
> deferred TPM initialization could happen later, yes?

Well yeah.  The TPM could be configured as a module, but that scenario is not of
interest.  That's way too late.  The case being addressed in this patch set is
when the TPM driver tries to initialize at device_initcall, returns
EPROBE_DEFER, and is retried at deferred_probe_initcall (late_initcall).  Since
ordering within an initcall is not supported, this patch attempts to initialize
IMA at late_initcall and similarly retries, in this case, at late_initcall_sync.
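
The ordering problem can be modelled in a few lines of userspace C. This
is only a sketch: it assumes the deferred TPM probe always succeeds when
retried during late_initcall, which is exactly the assumption questioned
above, and the model_* names are hypothetical.

```c
#include <stdbool.h>

enum model_level { MODEL_DEVICE, MODEL_LATE, MODEL_LATE_SYNC };

struct model_result {
	bool ima_has_tpm;		/* did IMA find a TPM when it initialized? */
	enum model_level ima_level;	/* level at which IMA initialized */
};

/*
 * tpm_deferred:   first probe at device_initcall returned -EPROBE_DEFER
 * ima_runs_first: within late_initcall, IMA runs before the probe retry
 *                 (ordering inside one level is not guaranteed)
 */
static struct model_result model_boot(bool tpm_deferred, bool ima_runs_first)
{
	struct model_result r = { false, MODEL_LATE_SYNC };
	bool tpm_ready = !tpm_deferred;	/* state after device_initcall */
	bool ima_done = false;

	/* late_initcall: first IMA attempt, racing with the probe retry. */
	if (!ima_runs_first)
		tpm_ready = true;	/* retry ran before IMA (assumed to succeed) */
	if (tpm_ready) {
		r.ima_has_tpm = true;
		r.ima_level = MODEL_LATE;
		ima_done = true;
	}
	tpm_ready = true;		/* retry has run by the end of this level */

	/* late_initcall_sync: second and final IMA attempt. */
	if (!ima_done) {
		r.ima_has_tpm = tpm_ready;
		r.ima_level = MODEL_LATE_SYNC;
	}
	return r;
}
```

In this model, model_boot(true, true) is the case the patch targets: the
first attempt finds no TPM, and the retry at late_initcall_sync does.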

Mimi

^ permalink raw reply

* Re: [PATCH v5 09/13] ima: Add support for staging measurements with prompt
From: Roberto Sassu @ 2026-05-04 12:51 UTC (permalink / raw)
  To: corbet, skhan, zohar, dmitry.kasatkin, eric.snowberg, paul,
	jmorris, serge
  Cc: linux-doc, linux-kernel, linux-integrity, linux-security-module,
	gregorylumen, chenste, nramas, Roberto Sassu
In-Reply-To: <20260429160319.4162918-10-roberto.sassu@huaweicloud.com>

On Wed, 2026-04-29 at 18:03 +0200, Roberto Sassu wrote:
> From: Roberto Sassu <roberto.sassu@huawei.com>
> 
> Introduce the ability of staging the IMA measurement list and deleting them
> with a prompt.
> 
> Staging means moving the current content of the measurement list to a
> separate location, and allowing users to read and delete it. This causes
> the measurement list to be atomically truncated before new measurements can
> be added. Staging can be done only once at a time. In the event of kexec(),
> staging is reverted and staged entries will be carried over to the new
> kernel.
> 
> Introduce ascii_runtime_measurements_<algo>_staged and
> binary_runtime_measurements_<algo>_staged interfaces to access and delete
> the measurements. Also, add write permission to the original measurement
> interfaces.
> 
> Use 'echo A > <IMA original interface>' and
> 'echo D > <IMA _staged interface>' to respectively stage and delete the
> entire measurements list. Locking of these interfaces is also mediated with
> a call to _ima_measurements_open() and with ima_measurements_release().

While doing the staging in the original interface looks more intuitive,
since it is the interface the user operates on, it causes loss of
transaction atomicity.

An agent opening the original interface has to close it, open the
staged interface to read and delete the staged measurements. Other
agents can open the staged interface first and do operations the
original agent didn't intend to do.

Will restore the previous behavior of staging/reading/deleting on the
staged interface. Will keep deleting N entries on the original
interface, since there is no risk of races.
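
For reference, the stage / delete / revert semantics from the quoted
commit message can be modelled in userspace as below. This is a sketch
only: the fixed-size arrays, the model_* names and the -EBUSY return for
a second stage request are assumptions, and no locking is shown.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX 16

/* Two lists of measurement ids: the current one and the staged one. */
struct model_lists {
	int cur[MODEL_MAX];
	size_t ncur;
	int staged[MODEL_MAX];
	size_t nstaged;
};

/* Append a new measurement to the current list. */
static int model_add(struct model_lists *l, int entry)
{
	if (l->ncur == MODEL_MAX)
		return -ENOMEM;
	l->cur[l->ncur++] = entry;
	return 0;
}

/* Move the whole current list to the staged list; one staging at a time. */
static int model_stage(struct model_lists *l)
{
	if (l->nstaged)
		return -EBUSY;	/* hypothetical errno for "already staged" */
	memcpy(l->staged, l->cur, l->ncur * sizeof(l->cur[0]));
	l->nstaged = l->ncur;
	l->ncur = 0;
	return 0;
}

/* Drop every staged entry. */
static void model_delete_staged(struct model_lists *l)
{
	l->nstaged = 0;
}

/* kexec case: staged entries are put back in front of the current ones. */
static int model_revert(struct model_lists *l)
{
	if (l->nstaged + l->ncur > MODEL_MAX)
		return -ENOMEM;
	memmove(l->cur + l->nstaged, l->cur, l->ncur * sizeof(l->cur[0]));
	memcpy(l->cur, l->staged, l->nstaged * sizeof(l->cur[0]));
	l->ncur += l->nstaged;
	l->nstaged = 0;
	return 0;
}
```

Staging entries 1 and 2, adding 3, then reverting leaves the current
list as 1, 2, 3, which matches the "prepended" wording in the Kconfig
help text.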

Roberto

> Implement the staging functionality by introducing the new global
> measurements list ima_measurements_staged, and ima_queue_stage() and
> ima_queue_staged_delete_all() to respectively move measurements from the
> current measurements list to the staged one, and to move staged
> measurements to the ima_measurements_trim list for deletion. Introduce
> ima_queue_delete() to delete the measurements.
> 
> Finally, introduce the BINARY_STAGED and BINARY_FULL binary measurements
> list types, to maintain the counters and the binary size of staged
> measurements and the full measurements list (including entries that were
> staged). BINARY still represents the current binary measurements list.
> 
> Use the binary size for the BINARY + BINARY_STAGED types in
> ima_add_kexec_buffer(), since both measurements list types are copied to
> the secondary kernel during kexec. Use BINARY_FULL in
> ima_measure_kexec_event(), to generate a critical data record.
> 
> It should be noted that the BINARY_FULL counter is not passed through
> kexec. Thus, the number of entries included in the kexec critical data
> records refers to the entries since the previous kexec records.
> 
> Note: This code derives from the Alt-IMA Huawei project, whose license is
>       GPL-2.0 OR MIT.
> 
> Link: https://github.com/linux-integrity/linux/issues/1
> Suggested-by: Gregory Lumen <gregorylumen@linux.microsoft.com> (staging revert)
> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
> ---
>  security/integrity/ima/Kconfig     |  13 +++
>  security/integrity/ima/ima.h       |   8 +-
>  security/integrity/ima/ima_fs.c    | 181 ++++++++++++++++++++++++++---
>  security/integrity/ima/ima_kexec.c |  24 +++-
>  security/integrity/ima/ima_queue.c |  97 +++++++++++++++-
>  5 files changed, 302 insertions(+), 21 deletions(-)
> 
> diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
> index 862fbee2b174..48c906793efb 100644
> --- a/security/integrity/ima/Kconfig
> +++ b/security/integrity/ima/Kconfig
> @@ -332,4 +332,17 @@ config IMA_KEXEC_EXTRA_MEMORY_KB
>  	  If set to the default value of 0, an extra half page of memory for those
>  	  additional measurements will be allocated.
>  
> +config IMA_STAGING
> +	bool "Support for staging the measurements list"
> +	default y
> +	help
> +	  Add support for staging the measurements list.
> +
> +	  It allows user space to stage the measurements list for deletion and
> +	  to delete the staged measurements after confirmation.
> +
> +	  On kexec, staging is reverted and staged measurements are prepended
> +	  to the current measurements list when measurements are copied to the
> +	  secondary kernel.
> +
>  endif
> diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
> index f8ab6b604c0d..ca8fa43ec72b 100644
> --- a/security/integrity/ima/ima.h
> +++ b/security/integrity/ima/ima.h
> @@ -30,9 +30,11 @@ enum tpm_pcrs { TPM_PCR0 = 0, TPM_PCR8 = 8, TPM_PCR10 = 10 };
>  
>  /*
>   * BINARY: current binary measurements list
> + * BINARY_STAGED: staged binary measurements list
> + * BINARY_FULL: binary measurements list since IMA init (lost after kexec)
>   */
>  enum binary_lists {
> -	BINARY, BINARY__LAST
> +	BINARY, BINARY_STAGED, BINARY_FULL, BINARY__LAST
>  };
>  
>  /* digest size for IMA, fits SHA1 or MD5 */
> @@ -125,6 +127,7 @@ struct ima_queue_entry {
>  	struct ima_template_entry *entry;
>  };
>  extern struct list_head ima_measurements;	/* list of all measurements */
> +extern struct list_head ima_measurements_staged; /* list of staged meas. */
>  
>  /* Some details preceding the binary serialized measurement list */
>  struct ima_kexec_hdr {
> @@ -315,6 +318,8 @@ struct ima_template_desc *ima_template_desc_current(void);
>  struct ima_template_desc *ima_template_desc_buf(void);
>  struct ima_template_desc *lookup_template_desc(const char *name);
>  bool ima_template_has_modsig(const struct ima_template_desc *ima_template);
> +int ima_queue_stage(void);
> +int ima_queue_staged_delete_all(void);
>  int ima_restore_measurement_entry(struct ima_template_entry *entry);
>  int ima_restore_measurement_list(loff_t bufsize, void *buf);
>  int ima_measurements_show(struct seq_file *m, void *v);
> @@ -335,6 +340,7 @@ extern spinlock_t ima_queue_lock;
>  extern atomic_long_t ima_num_entries[BINARY__LAST];
>  extern atomic_long_t ima_num_violations;
>  extern struct hlist_head __rcu *ima_htable;
> +extern struct mutex ima_extend_list_mutex;
>  
>  static inline unsigned int ima_hash_key(u8 *digest)
>  {
> diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
> index 7709a4576322..088d5a69aa92 100644
> --- a/security/integrity/ima/ima_fs.c
> +++ b/security/integrity/ima/ima_fs.c
> @@ -24,6 +24,13 @@
>  
>  #include "ima.h"
>  
> +/*
> + * Requests:
> + * 'A\n': stage the entire measurements list
> + * 'D\n': delete all staged measurements
> + */
> +#define STAGED_REQ_LENGTH 21
> +
>  static DEFINE_MUTEX(ima_write_mutex);
>  static DEFINE_MUTEX(ima_measure_mutex);
>  static long ima_measure_users;
> @@ -97,6 +104,11 @@ static void *ima_measurements_start(struct seq_file *m, loff_t *pos)
>  	return _ima_measurements_start(m, pos, &ima_measurements);
>  }
>  
> +static void *ima_measurements_staged_start(struct seq_file *m, loff_t *pos)
> +{
> +	return _ima_measurements_start(m, pos, &ima_measurements_staged);
> +}
> +
>  static void *_ima_measurements_next(struct seq_file *m, void *v, loff_t *pos,
>  				    struct list_head *head)
>  {
> @@ -118,6 +130,12 @@ static void *ima_measurements_next(struct seq_file *m, void *v, loff_t *pos)
>  	return _ima_measurements_next(m, v, pos, &ima_measurements);
>  }
>  
> +static void *ima_measurements_staged_next(struct seq_file *m, void *v,
> +					  loff_t *pos)
> +{
> +	return _ima_measurements_next(m, v, pos, &ima_measurements_staged);
> +}
> +
>  static void ima_measurements_stop(struct seq_file *m, void *v)
>  {
>  }
> @@ -211,6 +229,13 @@ static const struct seq_operations ima_measurments_seqops = {
>  	.show = ima_measurements_show
>  };
>  
> +static const struct seq_operations ima_measurments_staged_seqops = {
> +	.start = ima_measurements_staged_start,
> +	.next = ima_measurements_staged_next,
> +	.stop = ima_measurements_stop,
> +	.show = ima_measurements_show
> +};
> +
>  static int ima_measure_lock(bool write)
>  {
>  	mutex_lock(&ima_measure_mutex);
> @@ -276,9 +301,78 @@ static int ima_measurements_release(struct inode *inode, struct file *file)
>  	return ret;
>  }
>  
> +static int ima_measurements_staged_open(struct inode *inode, struct file *file)
> +{
> +	return _ima_measurements_open(inode, file,
> +				      &ima_measurments_staged_seqops);
> +}
> +
> +static ssize_t _ima_measurements_write(struct file *file,
> +				       const char __user *buf, size_t datalen,
> +				       loff_t *ppos, bool staged_interface)
> +{
> +	char req[STAGED_REQ_LENGTH];
> +	int ret;
> +
> +	if (*ppos > 0 || datalen < 2 || datalen > STAGED_REQ_LENGTH)
> +		return -EINVAL;
> +
> +	if (copy_from_user(req, buf, datalen) != 0)
> +		return -EFAULT;
> +
> +	if (req[datalen - 1] != '\n')
> +		return -EINVAL;
> +
> +	req[datalen - 1] = '\0';
> +
> +	switch (req[0]) {
> +	case 'A':
> +		if (datalen != 2 || staged_interface)
> +			return -EINVAL;
> +
> +		ret = ima_queue_stage();
> +		break;
> +	case 'D':
> +		if (datalen != 2 || !staged_interface)
> +			return -EINVAL;
> +
> +		ret = ima_queue_staged_delete_all();
> +		break;
> +	default:
> +		ret = -EINVAL;
> +	}
> +
> +	if (ret < 0)
> +		return ret;
> +
> +	return datalen;
> +}
> +
> +static ssize_t ima_measurements_write(struct file *file, const char __user *buf,
> +				      size_t datalen, loff_t *ppos)
> +{
> +	return _ima_measurements_write(file, buf, datalen, ppos, false);
> +}
> +
> +static ssize_t ima_measurements_staged_write(struct file *file,
> +					     const char __user *buf,
> +					     size_t datalen, loff_t *ppos)
> +{
> +	return _ima_measurements_write(file, buf, datalen, ppos, true);
> +}
> +
>  static const struct file_operations ima_measurements_ops = {
>  	.open = ima_measurements_open,
>  	.read = seq_read,
> +	.write = ima_measurements_write,
> +	.llseek = seq_lseek,
> +	.release = ima_measurements_release,
> +};
> +
> +static const struct file_operations ima_measurements_staged_ops = {
> +	.open = ima_measurements_staged_open,
> +	.read = seq_read,
> +	.write = ima_measurements_staged_write,
>  	.llseek = seq_lseek,
>  	.release = ima_measurements_release,
>  };
> @@ -352,6 +446,29 @@ static int ima_ascii_measurements_open(struct inode *inode, struct file *file)
>  static const struct file_operations ima_ascii_measurements_ops = {
>  	.open = ima_ascii_measurements_open,
>  	.read = seq_read,
> +	.write = ima_measurements_write,
> +	.llseek = seq_lseek,
> +	.release = ima_measurements_release,
> +};
> +
> +static const struct seq_operations ima_ascii_measurements_staged_seqops = {
> +	.start = ima_measurements_staged_start,
> +	.next = ima_measurements_staged_next,
> +	.stop = ima_measurements_stop,
> +	.show = ima_ascii_measurements_show
> +};
> +
> +static int ima_ascii_measurements_staged_open(struct inode *inode,
> +					      struct file *file)
> +{
> +	return _ima_measurements_open(inode, file,
> +				      &ima_ascii_measurements_staged_seqops);
> +}
> +
> +static const struct file_operations ima_ascii_measurements_staged_ops = {
> +	.open = ima_ascii_measurements_staged_open,
> +	.read = seq_read,
> +	.write = ima_measurements_staged_write,
>  	.llseek = seq_lseek,
>  	.release = ima_measurements_release,
>  };
> @@ -459,10 +576,20 @@ static const struct seq_operations ima_policy_seqops = {
>  };
>  #endif
>  
> -static int __init create_securityfs_measurement_lists(void)
> +static int __init create_securityfs_measurement_lists(bool staging)
>  {
> +	const struct file_operations *ascii_ops = &ima_ascii_measurements_ops;
> +	const struct file_operations *binary_ops = &ima_measurements_ops;
> +	mode_t permissions = (S_IRUSR | S_IRGRP | S_IWUSR | S_IWGRP);
> +	const char *file_suffix = "";
>  	int count = NR_BANKS(ima_tpm_chip);
>  
> +	if (staging) {
> +		ascii_ops = &ima_ascii_measurements_staged_ops;
> +		binary_ops = &ima_measurements_staged_ops;
> +		file_suffix = "_staged";
> +	}
> +
>  	if (ima_sha1_idx >= NR_BANKS(ima_tpm_chip))
>  		count++;
>  
> @@ -473,29 +600,32 @@ static int __init create_securityfs_measurement_lists(void)
>  
>  		if (algo == HASH_ALGO__LAST)
>  			snprintf(file_name, sizeof(file_name),
> -				 "ascii_runtime_measurements_tpm_alg_%x",
> -				 ima_tpm_chip->allocated_banks[i].alg_id);
> +				 "ascii_runtime_measurements_tpm_alg_%x%s",
> +				 ima_tpm_chip->allocated_banks[i].alg_id,
> +				 file_suffix);
>  		else
>  			snprintf(file_name, sizeof(file_name),
> -				 "ascii_runtime_measurements_%s",
> -				 hash_algo_name[algo]);
> -		dentry = securityfs_create_file(file_name, S_IRUSR | S_IRGRP,
> +				 "ascii_runtime_measurements_%s%s",
> +				 hash_algo_name[algo], file_suffix);
> +		dentry = securityfs_create_file(file_name, permissions,
>  						ima_dir, (void *)(uintptr_t)i,
> -						&ima_ascii_measurements_ops);
> +						ascii_ops);
>  		if (IS_ERR(dentry))
>  			return PTR_ERR(dentry);
>  
>  		if (algo == HASH_ALGO__LAST)
>  			snprintf(file_name, sizeof(file_name),
> -				 "binary_runtime_measurements_tpm_alg_%x",
> -				 ima_tpm_chip->allocated_banks[i].alg_id);
> +				 "binary_runtime_measurements_tpm_alg_%x%s",
> +				 ima_tpm_chip->allocated_banks[i].alg_id,
> +				 file_suffix);
>  		else
>  			snprintf(file_name, sizeof(file_name),
> -				 "binary_runtime_measurements_%s",
> -				 hash_algo_name[algo]);
> -		dentry = securityfs_create_file(file_name, S_IRUSR | S_IRGRP,
> +				 "binary_runtime_measurements_%s%s",
> +				 hash_algo_name[algo], file_suffix);
> +
> +		dentry = securityfs_create_file(file_name, permissions,
>  						ima_dir, (void *)(uintptr_t)i,
> -						&ima_measurements_ops);
> +						binary_ops);
>  		if (IS_ERR(dentry))
>  			return PTR_ERR(dentry);
>  	}
> @@ -503,6 +633,23 @@ static int __init create_securityfs_measurement_lists(void)
>  	return 0;
>  }
>  
> +static int __init create_securityfs_staging_links(void)
> +{
> +	struct dentry *dentry;
> +
> +	dentry = securityfs_create_symlink("binary_runtime_measurements_staged",
> +		ima_dir, "binary_runtime_measurements_sha1_staged", NULL);
> +	if (IS_ERR(dentry))
> +		return PTR_ERR(dentry);
> +
> +	dentry = securityfs_create_symlink("ascii_runtime_measurements_staged",
> +		ima_dir, "ascii_runtime_measurements_sha1_staged", NULL);
> +	if (IS_ERR(dentry))
> +		return PTR_ERR(dentry);
> +
> +	return 0;
> +}
> +
>  /*
>   * ima_open_policy: sequentialize access to the policy file
>   */
> @@ -595,7 +742,13 @@ int __init ima_fs_init(void)
>  		goto out;
>  	}
>  
> -	ret = create_securityfs_measurement_lists();
> +	ret = create_securityfs_measurement_lists(false);
> +	if (ret == 0 && IS_ENABLED(CONFIG_IMA_STAGING)) {
> +		ret = create_securityfs_measurement_lists(true);
> +		if (ret == 0)
> +			ret = create_securityfs_staging_links();
> +	}
> +
>  	if (ret != 0)
>  		goto out;
>  
> diff --git a/security/integrity/ima/ima_kexec.c b/security/integrity/ima/ima_kexec.c
> index d7d0fb639d99..064cfce0c318 100644
> --- a/security/integrity/ima/ima_kexec.c
> +++ b/security/integrity/ima/ima_kexec.c
> @@ -42,8 +42,8 @@ void ima_measure_kexec_event(const char *event_name)
>  	long len;
>  	int n;
>  
> -	buf_size = ima_get_binary_runtime_size(BINARY);
> -	len = atomic_long_read(&ima_num_entries[BINARY]);
> +	buf_size = ima_get_binary_runtime_size(BINARY_FULL);
> +	len = atomic_long_read(&ima_num_entries[BINARY_FULL]);
>  
>  	n = scnprintf(ima_kexec_event, IMA_KEXEC_EVENT_LEN,
>  		      "kexec_segment_size=%lu;ima_binary_runtime_size=%lu;"
> @@ -106,13 +106,28 @@ static int ima_dump_measurement_list(unsigned long *buffer_size, void **buffer,
>  
>  	memset(&khdr, 0, sizeof(khdr));
>  	khdr.version = 1;
> -	/* This is an append-only list, no need to hold the RCU read lock */
> -	list_for_each_entry_rcu(qe, &ima_measurements, later, true) {
> +	/*
> +	 * It can race with ima_queue_stage() and ima_queue_staged_delete_all().
> +	 */
> +	mutex_lock(&ima_extend_list_mutex);
> +
> +	list_for_each_entry_rcu(qe, &ima_measurements_staged, later,
> +				lockdep_is_held(&ima_extend_list_mutex)) {
>  		ret = ima_dump_measurement(&khdr, qe);
>  		if (ret < 0)
>  			break;
>  	}
>  
> +	list_for_each_entry_rcu(qe, &ima_measurements, later,
> +				lockdep_is_held(&ima_extend_list_mutex)) {
> +		if (!ret)
> +			ret = ima_dump_measurement(&khdr, qe);
> +		if (ret < 0)
> +			break;
> +	}
> +
> +	mutex_unlock(&ima_extend_list_mutex);
> +
>  	/*
>  	 * fill in reserved space with some buffer details
>  	 * (eg. version, buffer size, number of measurements)
> @@ -167,6 +182,7 @@ void ima_add_kexec_buffer(struct kimage *image)
>  		extra_memory = CONFIG_IMA_KEXEC_EXTRA_MEMORY_KB * 1024;
>  
>  	binary_runtime_size = ima_get_binary_runtime_size(BINARY) +
> +			      ima_get_binary_runtime_size(BINARY_STAGED) +
>  			      extra_memory;
>  
>  	if (binary_runtime_size >= ULONG_MAX - PAGE_SIZE)
> diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
> index b6d10dceb669..50519ed837d4 100644
> --- a/security/integrity/ima/ima_queue.c
> +++ b/security/integrity/ima/ima_queue.c
> @@ -26,6 +26,7 @@
>  static struct tpm_digest *digests;
>  
>  LIST_HEAD(ima_measurements);	/* list of all measurements */
> +LIST_HEAD(ima_measurements_staged); /* list of staged measurements */
>  #ifdef CONFIG_IMA_KEXEC
>  static unsigned long binary_runtime_size[BINARY__LAST];
>  #else
> @@ -45,11 +46,11 @@ atomic_long_t ima_num_violations = ATOMIC_LONG_INIT(0);
>  /* key: inode (before secure-hashing a file) */
>  struct hlist_head __rcu *ima_htable;
>  
> -/* mutex protects atomicity of extending measurement list
> +/* mutex protects atomicity of extending and staging measurement list
>   * and extending the TPM PCR aggregate. Since tpm_extend can take
>   * long (and the tpm driver uses a mutex), we can't use the spinlock.
>   */
> -static DEFINE_MUTEX(ima_extend_list_mutex);
> +DEFINE_MUTEX(ima_extend_list_mutex);
>  
>  /*
>   * Used internally by the kernel to suspend measurements.
> @@ -174,12 +175,16 @@ static int ima_add_digest_entry(struct ima_template_entry *entry,
>  				lockdep_is_held(&ima_extend_list_mutex));
>  
>  	atomic_long_inc(&ima_num_entries[BINARY]);
> +	atomic_long_inc(&ima_num_entries[BINARY_FULL]);
> +
>  	if (update_htable) {
>  		key = ima_hash_key(entry->digests[ima_hash_algo_idx].digest);
>  		hlist_add_head_rcu(&qe->hnext, &htable[key]);
>  	}
>  
>  	ima_update_binary_runtime_size(entry, BINARY);
> +	ima_update_binary_runtime_size(entry, BINARY_FULL);
> +
>  	return 0;
>  }
>  
> @@ -280,6 +285,94 @@ int ima_add_template_entry(struct ima_template_entry *entry, int violation,
>  	return result;
>  }
>  
> +int ima_queue_stage(void)
> +{
> +	int ret = 0;
> +
> +	mutex_lock(&ima_extend_list_mutex);
> +	if (!list_empty(&ima_measurements_staged)) {
> +		ret = -EEXIST;
> +		goto out_unlock;
> +	}
> +
> +	if (list_empty(&ima_measurements)) {
> +		ret = -ENOENT;
> +		goto out_unlock;
> +	}
> +
> +	list_replace(&ima_measurements, &ima_measurements_staged);
> +	INIT_LIST_HEAD(&ima_measurements);
> +
> +	atomic_long_set(&ima_num_entries[BINARY_STAGED],
> +			atomic_long_read(&ima_num_entries[BINARY]));
> +	atomic_long_set(&ima_num_entries[BINARY], 0);
> +
> +	if (IS_ENABLED(CONFIG_IMA_KEXEC)) {
> +		binary_runtime_size[BINARY_STAGED] =
> +					binary_runtime_size[BINARY];
> +		binary_runtime_size[BINARY] = 0;
> +	}
> +out_unlock:
> +	mutex_unlock(&ima_extend_list_mutex);
> +	return ret;
> +}
> +
> +static void ima_queue_delete(struct list_head *head);
> +
> +int ima_queue_staged_delete_all(void)
> +{
> +	LIST_HEAD(ima_measurements_trim);
> +
> +	mutex_lock(&ima_extend_list_mutex);
> +	if (list_empty(&ima_measurements_staged)) {
> +		mutex_unlock(&ima_extend_list_mutex);
> +		return -ENOENT;
> +	}
> +
> +	list_replace(&ima_measurements_staged, &ima_measurements_trim);
> +	INIT_LIST_HEAD(&ima_measurements_staged);
> +
> +	atomic_long_set(&ima_num_entries[BINARY_STAGED], 0);
> +
> +	if (IS_ENABLED(CONFIG_IMA_KEXEC))
> +		binary_runtime_size[BINARY_STAGED] = 0;
> +
> +	mutex_unlock(&ima_extend_list_mutex);
> +
> +	ima_queue_delete(&ima_measurements_trim);
> +	return 0;
> +}
> +
> +static void ima_queue_delete(struct list_head *head)
> +{
> +	struct ima_queue_entry *qe, *qe_tmp;
> +	unsigned int i;
> +
> +	list_for_each_entry_safe(qe, qe_tmp, head, later) {
> +		/*
> +		 * Safe to free template_data here without synchronize_rcu()
> +		 * because the only htable reader, ima_lookup_digest_entry(),
> +		 * accesses only entry->digests, not template_data. If new
> +		 * htable readers are added that access template_data, a
> +		 * synchronize_rcu() is required here.
> +		 */
> +		for (i = 0; i < qe->entry->template_desc->num_fields; i++) {
> +			kfree(qe->entry->template_data[i].data);
> +			qe->entry->template_data[i].data = NULL;
> +			qe->entry->template_data[i].len = 0;
> +		}
> +
> +		list_del(&qe->later);
> +
> +		/* No leak if condition is false, referenced by ima_htable. */
> +		if (IS_ENABLED(CONFIG_IMA_DISABLE_HTABLE)) {
> +			kfree(qe->entry->digests);
> +			kfree(qe->entry);
> +			kfree(qe);
> +		}
> +	}
> +}
> +
>  int ima_restore_measurement_entry(struct ima_template_entry *entry)
>  {
>  	int result = 0;



* Re: [PATCH 1/3] apparmor: Fix return in ns_mkdir_op
From: Ryan Lee @ 2026-05-04 18:22 UTC (permalink / raw)
  To: Hongling Zeng
  Cc: john.johansen, paul, jmorris, serge, neil, brauner, jlayton, jack,
	apparmor, linux-security-module, linux-kernel, zhongling0719
In-Reply-To: <20260503041243.200895-1-zenghongling@kylinos.cn>

On Sat, May 2, 2026 at 9:13 PM Hongling Zeng <zenghongling@kylinos.cn> wrote:
>
> Return NULL instead of passing zero to ERR_PTR when error is zero.
>   Fixes smatch warning:
>     - security/apparmor/apparmorfs.c:1846 ns_mkdir_op() warn:
>       passing zero to 'ERR_PTR'
>
> Fixes: 88d5baf69082 ("Change inode_operations.mkdir to return struct dentry *")
> Signed-off-by: Hongling Zeng <zenghongling@kylinos.cn>
> ---
>  security/apparmor/apparmorfs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
> index ededaf46f3ca..1d7b1c70f22a 100644
> --- a/security/apparmor/apparmorfs.c
> +++ b/security/apparmor/apparmorfs.c
> @@ -1922,7 +1922,7 @@ static struct dentry *ns_mkdir_op(struct mnt_idmap *idmap, struct inode *dir,
>         mutex_unlock(&parent->lock);
>         aa_put_ns(parent);
>
> -       return ERR_PTR(error);
> +       return error ? ERR_PTR(error) : NULL;
>  }
>
>  static int ns_rmdir_op(struct inode *dir, struct dentry *dentry)
> --
> 2.25.1
>
>

Reviewed-by: Ryan Lee <ryan.lee@canonical.com>


* Re: [PATCH v2 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: Paul Moore @ 2026-05-04 20:14 UTC (permalink / raw)
  To: David Windsor
  Cc: Alexander Viro, Christian Brauner, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman,
	Kumar Kartikeya Dwivedi, KP Singh, Matt Bobrowski, James Morris,
	Serge E. Hallyn, Mimi Zohar, Roberto Sassu, Dmitry Kasatkin,
	Stephen Smalley, Casey Schaufler, Song Liu, Jan Kara,
	John Fastabend, Martin KaFai Lau, Yonghong Song, Jiri Olsa,
	Eric Snowberg, Ondrej Mosnacek, linux-fsdevel, linux-kernel, bpf,
	linux-security-module, linux-integrity, selinux
In-Reply-To: <20260503211835.16103-2-dwindsor@gmail.com>

On Sun, May 3, 2026 at 5:18 PM David Windsor <dwindsor@gmail.com> wrote:
>
> Add bpf_init_inode_xattr() kfunc for BPF LSM programs to atomically set
> xattrs via the inode_init_security hook using lsm_get_xattr_slot().
>
> The inode_init_security hook previously took the xattr array and count
> as two separate output parameters (struct xattr *xattrs, int
> *xattr_count), which BPF programs cannot write to. Pass the xattr state
> as a single context object (struct lsm_xattr_ctx) instead, and have
> bpf_init_inode_xattr() take that context directly. Update the existing
> in-tree callers of inode_init_security to take and forward the new
> lsm_xattr_ctx.
>
> Because we rely on the hook-specific ctx layout, the kfunc is
> restricted to lsm/inode_init_security. Restrict the xattr names that
> may be set via this kfunc to the bpf.* namespace.
>
> Suggested-by: Song Liu <song@kernel.org>
> Signed-off-by: David Windsor <dwindsor@gmail.com>
> ---
>  fs/bpf_fs_kfuncs.c                | 106 +++++++++++++++++++++++++++++-
>  include/linux/bpf_lsm.h           |   3 +
>  include/linux/evm.h               |   9 +--
>  include/linux/lsm_hook_defs.h     |   4 +-
>  include/linux/lsm_hooks.h         |  16 ++---
>  include/linux/security.h          |   5 ++
>  kernel/bpf/bpf_lsm.c              |   1 +
>  security/bpf/hooks.c              |   1 +
>  security/integrity/evm/evm_main.c |   8 ++-
>  security/security.c               |   7 +-
>  security/selinux/hooks.c          |   4 +-
>  security/smack/smack_lsm.c        |  13 ++--
>  12 files changed, 147 insertions(+), 30 deletions(-)

Comments below ...

> diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c
> index 9d27be058494..193accc00796 100644
> --- a/fs/bpf_fs_kfuncs.c
> +++ b/fs/bpf_fs_kfuncs.c
> @@ -10,6 +10,7 @@
>  #include <linux/fsnotify.h>
>  #include <linux/file.h>
>  #include <linux/kernfs.h>
> +#include <linux/lsm_hooks.h>
>  #include <linux/mm.h>
>  #include <linux/xattr.h>
>
> @@ -353,6 +354,97 @@ __bpf_kfunc int bpf_cgroup_read_xattr(struct cgroup *cgroup, const char *name__s
>  }
>  #endif /* CONFIG_CGROUPS */
>
> +static int bpf_xattrs_used(const struct lsm_xattr_ctx *ctx)
> +{
> +       const size_t prefix_len = sizeof(XATTR_BPF_LSM_SUFFIX) - 1;
> +       int i, n = 0;
> +
> +       for (i = 0; i < *ctx->xattr_count; i++) {
> +               const char *name = ctx->xattrs[i].name;
> +
> +               if (name && !strncmp(name, XATTR_BPF_LSM_SUFFIX, prefix_len))
> +                       n++;
> +       }
> +       return n;
> +}
> +
> +static int __bpf_init_inode_xattr(struct lsm_xattr_ctx *xattr_ctx,
> +                                 const char *name__str,
> +                                 const struct bpf_dynptr *value_p)
> +{
> +       struct bpf_dynptr_kern *value_ptr = (struct bpf_dynptr_kern *)value_p;
> +       size_t name_len;
> +       void *xattr_value;
> +       struct xattr *xattr;
> +       struct xattr *xattrs;
> +       int *xattr_count;
> +       const void *value;
> +       u32 value_len;
> +
> +       if (!xattr_ctx || !name__str)
> +               return -EINVAL;
> +
> +       xattrs = xattr_ctx->xattrs;
> +       xattr_count = xattr_ctx->xattr_count;
> +       if (!xattrs || !xattr_count)
> +               return -EINVAL;
> +       if (bpf_xattrs_used(xattr_ctx) >= BPF_LSM_INODE_INIT_XATTRS)
> +               return -ENOSPC;
> +
> +       name_len = strlen(name__str);
> +       if (name_len == 0 || name_len > XATTR_NAME_MAX)
> +               return -EINVAL;
> +       if (strncmp(name__str, XATTR_BPF_LSM_SUFFIX,
> +                   sizeof(XATTR_BPF_LSM_SUFFIX) - 1))
> +               return -EPERM;
> +
> +       value_len = __bpf_dynptr_size(value_ptr);
> +       if (value_len == 0 || value_len > XATTR_SIZE_MAX)
> +               return -EINVAL;
> +
> +       value = __bpf_dynptr_data(value_ptr, value_len);
> +       if (!value)
> +               return -EINVAL;
> +
> +       /* Combine xattr value + name into one allocation. */
> +       xattr_value = kmalloc(value_len + name_len + 1, GFP_KERNEL);
> +       if (!xattr_value)
> +               return -ENOMEM;
> +
> +       memcpy(xattr_value, value, value_len);
> +       memcpy(xattr_value + value_len, name__str, name_len);
> +       ((char *)xattr_value)[value_len + name_len] = '\0';
> +
> +       xattr = lsm_get_xattr_slot(xattr_ctx);
> +       if (!xattr) {
> +               kfree(xattr_value);
> +               return -ENOSPC;
> +       }
> +
> +       xattr->value = xattr_value;
> +       xattr->name = (const char *)xattr_value + value_len;
> +       xattr->value_len = value_len;
> +
> +       return 0;
> +}
> +
> +/**
> + * bpf_init_inode_xattr - set an xattr on a new inode from inode_init_security
> + * @xattr_ctx: inode_init_security xattr state from the hook context
> + * @name__str: xattr name (e.g., "bpf.file_label")
> + * @value_p: dynptr containing the xattr value
> + *
> + * Only callable from lsm/inode_init_security programs.
> + *
> + * Return: 0 on success, negative error on failure.
> + */
> +__bpf_kfunc int bpf_init_inode_xattr(struct lsm_xattr_ctx *xattr_ctx,
> +                                    const char *name__str,
> +                                    const struct bpf_dynptr *value_p)
> +{
> +       return __bpf_init_inode_xattr(xattr_ctx, name__str, value_p);
> +}
> +
>  __bpf_kfunc_end_defs();
>
>  BTF_KFUNCS_START(bpf_fs_kfunc_set_ids)
> @@ -363,13 +455,25 @@ BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE)
>  BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE)
>  BTF_ID_FLAGS(func, bpf_set_dentry_xattr, KF_SLEEPABLE)
>  BTF_ID_FLAGS(func, bpf_remove_dentry_xattr, KF_SLEEPABLE)
> +BTF_ID_FLAGS(func, bpf_init_inode_xattr, KF_SLEEPABLE)
>  BTF_KFUNCS_END(bpf_fs_kfunc_set_ids)
>
> +BTF_ID_LIST(bpf_lsm_inode_init_security_btf_ids)
> +BTF_ID(func, bpf_lsm_inode_init_security)
> +
> +BTF_ID_LIST(bpf_init_inode_xattr_btf_ids)
> +BTF_ID(func, bpf_init_inode_xattr)
> +
>  static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
>  {
>         if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) ||
> -           prog->type == BPF_PROG_TYPE_LSM)
> +           prog->type == BPF_PROG_TYPE_LSM) {
> +               /* bpf_init_inode_xattr only attaches to inode_init_security. */
> +               if (kfunc_id == bpf_init_inode_xattr_btf_ids[0] &&
> +                   prog->aux->attach_btf_id != bpf_lsm_inode_init_security_btf_ids[0])
> +                       return -EACCES;
>                 return 0;
> +       }
>         return -EACCES;
>  }

Perhaps I'm simply not seeing it, but is there a check to ensure that
there is only one BPF LSM calling into security_inode_init_security()
at any given time?  With the BPF LSM only reserving a single xattr
slot, multiple loaded BPF LSM programs providing
security_inode_init_security() callbacks will be a problem.

> diff --git a/include/linux/security.h b/include/linux/security.h
> index 41d7367cf403..a2fc72e63ada 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -68,6 +68,11 @@ struct watch;
>  struct watch_notification;
>  struct lsm_ctx;
>
> +struct lsm_xattr_ctx {
> +       struct xattr *xattrs;
> +       int *xattr_count;
> +};

I'd prefer this to be simply "struct lsm_xattrs" as "ctx" is an
overloaded term in the LSM space.

> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index 97801966bf32..dca81a22bf83 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -2962,11 +2962,11 @@ static int selinux_dentry_create_files_as(struct dentry *dentry, int mode,
>
>  static int selinux_inode_init_security(struct inode *inode, struct inode *dir,
>                                        const struct qstr *qstr,
> -                                      struct xattr *xattrs, int *xattr_count)
> +                                      struct lsm_xattr_ctx *xattr_ctx)
>  {
>         const struct cred_security_struct *crsec = selinux_cred(current_cred());
>         struct superblock_security_struct *sbsec;
> -       struct xattr *xattr = lsm_get_xattr_slot(xattrs, xattr_count);
> +       struct xattr *xattr = lsm_get_xattr_slot(xattr_ctx);
>         u32 newsid, clen;
>         u16 newsclass;
>         int rc;

In case you didn't see it, your fix for the above lsm_get_xattr_slot()
usage is now in Linus' tree.  It's a trivial bit of merge fuzz, but
you might want to rebase your next revision.

-- 
paul-moore.com


* Re: [PATCH] ima: debugging late_initcall_sync measurements
From: Paul Moore @ 2026-05-04 20:51 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Yeoreum Yun, Jonathan McDowell, linux-security-module,
	linux-kernel, linux-integrity, linux-arm-kernel, kvmarm, jmorris,
	serge, roberto.sassu, dmitry.kasatkin, eric.snowberg, jarkko, jgg,
	sudeep.holla, maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will, noodles, sebastianene
In-Reply-To: <ff28c6dcb60c357c752724927addaa8c4fd3bf2c.camel@linux.ibm.com>

On Mon, May 4, 2026 at 8:03 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> On Sun, 2026-05-03 at 12:46 -0400, Paul Moore wrote:
> > Regardless, assuming you always want IMA to leverage TPMs when they
> > exist, your reply suggests that using an initcall based IMA init
> > scheme, even a late-sync initcall, may not be sufficient because
> > deferred TPM initialization could happen later, yes?
>
> Well yeah.  The TPM could be configured as a module, but that scenario is not of
> interest.  That's way too late.  The case being addressed in this patch set is
> when the TPM driver tries to initialize at device_initcall, returns
> EPROBE_DEFER, and is retried at deferred_probe_initcall (late_initcall).  Since
> ordering within an initcall is not supported, this patch attempts to initialize
> IMA at late_initcall and similarly retries, in this case, at late_initcall_sync.

Okay, so from a TPM initialization perspective you are satisfied with
a late-sync IMA initialization, yes?

-- 
paul-moore.com


* Re: [PATCH v2 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: Song Liu @ 2026-05-04 21:40 UTC (permalink / raw)
  To: Paul Moore
  Cc: David Windsor, Alexander Viro, Christian Brauner,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, KP Singh,
	Matt Bobrowski, James Morris, Serge E. Hallyn, Mimi Zohar,
	Roberto Sassu, Dmitry Kasatkin, Stephen Smalley, Casey Schaufler,
	Jan Kara, John Fastabend, Martin KaFai Lau, Yonghong Song,
	Jiri Olsa, Eric Snowberg, Ondrej Mosnacek, linux-fsdevel,
	linux-kernel, bpf, linux-security-module, linux-integrity,
	selinux
In-Reply-To: <CAHC9VhSy5K5nQTtFUE4BScy1Ur61v7eZW067vTcUYDQeJb13Bw@mail.gmail.com>

On Mon, May 4, 2026 at 10:14 PM Paul Moore <paul@paul-moore.com> wrote:
[...]
> > diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c
> > index 9d27be058494..193accc00796 100644
> > --- a/fs/bpf_fs_kfuncs.c
> > +++ b/fs/bpf_fs_kfuncs.c
> > @@ -10,6 +10,7 @@
> >  #include <linux/fsnotify.h>
> >  #include <linux/file.h>
> >  #include <linux/kernfs.h>
> > +#include <linux/lsm_hooks.h>
> >  #include <linux/mm.h>
> >  #include <linux/xattr.h>
> >
> > @@ -353,6 +354,97 @@ __bpf_kfunc int bpf_cgroup_read_xattr(struct cgroup *cgroup, const char *name__s
> >  }
> >  #endif /* CONFIG_CGROUPS */
> >
> > +static int bpf_xattrs_used(const struct lsm_xattr_ctx *ctx)
> > +{
> > +       const size_t prefix_len = sizeof(XATTR_BPF_LSM_SUFFIX) - 1;
> > +       int i, n = 0;
> > +
> > +       for (i = 0; i < *ctx->xattr_count; i++) {
> > +               const char *name = ctx->xattrs[i].name;
> > +
> > +               if (name && !strncmp(name, XATTR_BPF_LSM_SUFFIX, prefix_len))
> > +                       n++;
> > +       }
> > +       return n;
> > +}
[...]
> > +
> >  static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
> >  {
> >         if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) ||
> > -           prog->type == BPF_PROG_TYPE_LSM)
> > +           prog->type == BPF_PROG_TYPE_LSM) {
> > +               /* bpf_init_inode_xattr only attaches to inode_init_security. */
> > +               if (kfunc_id == bpf_init_inode_xattr_btf_ids[0] &&
> > +                   prog->aux->attach_btf_id != bpf_lsm_inode_init_security_btf_ids[0])
> > +                       return -EACCES;

We need to mark bpf_init_inode_xattr with KF_RCU (requires a trusted
pointer), then we can remove this check above.

> >                 return 0;
> > +       }
> >         return -EACCES;
> >  }
>
> Perhaps I'm simply not seeing it, but is there a check to ensure that
> there is only one BPF LSM calling into security_inode_init_security()
> at any given time?  With the BPF LSM only reserving a single xattr
> slot, multiple loaded BPF LSM programs providing
> security_inode_init_security() callbacks will be a problem.

I don't think there is such a check. Also, a single BPF LSM function
may call the kfunc multiple times, which is also problematic.

I think we will need to make the default bigger, and also introduce
some realloc mechanism for the worst case scenario. This should
work, but the code might be a bit messy.

Thanks,
Song

>
> > diff --git a/include/linux/security.h b/include/linux/security.h
> > index 41d7367cf403..a2fc72e63ada 100644
> > --- a/include/linux/security.h
> > +++ b/include/linux/security.h
> > @@ -68,6 +68,11 @@ struct watch;
> >  struct watch_notification;
> >  struct lsm_ctx;
> >
[...]

^ permalink raw reply

* Re: [PATCH v2 0/4] Firmware LSM hook
From: Paul Moore @ 2026-05-04 22:33 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Roberto Sassu, KP Singh, Matt Bobrowski,
	Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
	Saeed Mahameed, Itay Avraham, Dave Jiang, Jonathan Cameron, bpf,
	linux-kernel, linux-kselftest, linux-rdma, Chiara Meiohas,
	Maher Sanalla, linux-security-module
In-Reply-To: <20260424221310.GA804026@ziepe.ca>

On Fri, Apr 24, 2026 at 6:13 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> ... I wonder if we are even speaking the same language.

Let's reset the conversation.

As I understand it, based on our discussion in this thread and Leon's
previous patchsets, the basic idea is to enable LSMs to enforce access
control over fwctl requests/commands sent from userspace.  I'm going
to start with that as a basis.

Using the kernel's docs on fwctl, the userspace API appears to consist
mostly of ioctls with some basic sysfs interfaces.  It looks like we
can mostly ignore the sysfs interface and focus on the ioctl side of
the API, do you agree?

https://docs.kernel.org/userspace-api/fwctl/fwctl.html

While normally I would suggest simply using the existing
security_file_ioctl() hook, Leon previously mentioned that the hook is
too early for fwctl as the userspace copy happens much later.  Looking
at the code, it appears that the copy happens in fwctl_fops_ioctl()
for all fwctl ioctls regardless of the device or ioctl, is that
correct?

Assuming the above is correct, how about the following LSM hook,
called after the copy_struct_from_user() in fwctl_fops_ioctl()?

 union fwctl_data {
   struct fwctl_info info;
   struct fwctl_rpc rpc;
 };

 int security_fwctl_ioctl(struct file *filep, unsigned int cmd,
                          union fwctl_data *arg);

Where @filep is the file/device being sent the ioctl, @cmd is the
ioctl command number (e.g. FWCTL_RPC), and @arg is the copied ioctl
data (e.g. ucmd.cmd in fwctl_fops_ioctl).  In addition to applying
access controls based on the ioctl command number, a capability that
already exists via the security_file_ioctl() hook, LSMs could also
apply access controls based on the RPC scope as well as any other well
defined data in the ioctl payload.

I expect most of the existing LSMs would implement callbacks for this
new hook with the subject being the process submitting the ioctl, the
object being the file/device that is being operated on with the
ioctl() call, and the access/privilege/verb/etc. being something along
the lines of INFO, RPC_CONFIG, RPC_DEBUG_READ, RPC_DEBUG_WRITE, or
RPC_DEBUG_WRITE_FULL.  Of course these are just quick examples to
demonstrate a point, please don't take those names as hard
requirements.  Each LSM is free to characterize the access request
however they like, in a way that best aligns with their security
model.
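
To make the shape of such a callback concrete, here is a rough userspace
sketch of how an LSM might turn the command number and the copied payload
into a single permission check.  Everything prefixed with demo_/DEMO_ is a
stand-in invented for illustration (including the permission bits and the
error codes); it is not the fwctl uapi and not a proposal for any specific
LSM.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-ins for the fwctl uapi; names and values are illustrative only. */
enum demo_cmd { DEMO_CMD_INFO, DEMO_CMD_RPC };
enum demo_scope {
	DEMO_SCOPE_CONFIG,
	DEMO_SCOPE_DEBUG_READ,
	DEMO_SCOPE_DEBUG_WRITE,
	DEMO_SCOPE_DEBUG_WRITE_FULL,
};

struct demo_info { uint32_t flags; };
struct demo_rpc { uint32_t scope; };

union demo_fwctl_data {
	struct demo_info info;
	struct demo_rpc rpc;
};

/* Hypothetical permission bits an LSM might define for this hook. */
#define DEMO_PERM_INFO			(1u << 0)
#define DEMO_PERM_RPC_CONFIG		(1u << 1)
#define DEMO_PERM_RPC_DEBUG_READ	(1u << 2)
#define DEMO_PERM_RPC_DEBUG_WRITE	(1u << 3)
#define DEMO_PERM_RPC_DEBUG_WRITE_FULL	(1u << 4)

/* Map (cmd, payload) to one permission bit; 0 means "unknown request". */
static unsigned int demo_required_perm(unsigned int cmd,
				       const union demo_fwctl_data *arg)
{
	if (cmd == DEMO_CMD_INFO)
		return DEMO_PERM_INFO;
	if (cmd != DEMO_CMD_RPC)
		return 0;

	switch (arg->rpc.scope) {
	case DEMO_SCOPE_CONFIG:
		return DEMO_PERM_RPC_CONFIG;
	case DEMO_SCOPE_DEBUG_READ:
		return DEMO_PERM_RPC_DEBUG_READ;
	case DEMO_SCOPE_DEBUG_WRITE:
		return DEMO_PERM_RPC_DEBUG_WRITE;
	case DEMO_SCOPE_DEBUG_WRITE_FULL:
		return DEMO_PERM_RPC_DEBUG_WRITE_FULL;
	default:
		return 0;
	}
}

/* Callback shape: @allowed is the mask the subject holds on the device. */
static int demo_fwctl_ioctl(unsigned int allowed, unsigned int cmd,
			    const union demo_fwctl_data *arg)
{
	unsigned int need = demo_required_perm(cmd, arg);

	if (!need)
		return -EINVAL;
	return (allowed & need) ? 0 : -EACCES;
}

/* Usage: a subject limited to read-only debug RPCs cannot issue a write. */
void demo_fwctl_selfcheck(void)
{
	union demo_fwctl_data d = { .rpc = { .scope = DEMO_SCOPE_DEBUG_WRITE } };

	assert(demo_fwctl_ioctl(DEMO_PERM_RPC_DEBUG_READ, DEMO_CMD_RPC, &d) == -EACCES);
}
```

The point is only that the decision needs the copied payload (the scope),
which is why the hook has to sit after the copy from userspace.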

-- 
paul-moore.com

^ permalink raw reply

* Re: [PATCH v2 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: Paul Moore @ 2026-05-04 22:42 UTC (permalink / raw)
  To: Song Liu
  Cc: David Windsor, Alexander Viro, Christian Brauner,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, KP Singh,
	Matt Bobrowski, James Morris, Serge E. Hallyn, Mimi Zohar,
	Roberto Sassu, Dmitry Kasatkin, Stephen Smalley, Casey Schaufler,
	Jan Kara, John Fastabend, Martin KaFai Lau, Yonghong Song,
	Jiri Olsa, Eric Snowberg, Ondrej Mosnacek, linux-fsdevel,
	linux-kernel, bpf, linux-security-module, linux-integrity,
	selinux
In-Reply-To: <CAPhsuW6sy2cdC4B7Z48-5A-yVX6fmVWxS_fWVjQxiX95KeUguw@mail.gmail.com>

On Mon, May 4, 2026 at 5:40 PM Song Liu <song@kernel.org> wrote:
> On Mon, May 4, 2026 at 10:14 PM Paul Moore <paul@paul-moore.com> wrote:
> [...]
> > > diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c
> > > index 9d27be058494..193accc00796 100644
> > > --- a/fs/bpf_fs_kfuncs.c
> > > +++ b/fs/bpf_fs_kfuncs.c
> > > @@ -10,6 +10,7 @@
> > >  #include <linux/fsnotify.h>
> > >  #include <linux/file.h>
> > >  #include <linux/kernfs.h>
> > > +#include <linux/lsm_hooks.h>
> > >  #include <linux/mm.h>
> > >  #include <linux/xattr.h>
> > >
> > > @@ -353,6 +354,97 @@ __bpf_kfunc int bpf_cgroup_read_xattr(struct cgroup *cgroup, const char *name__s
> > >  }
> > >  #endif /* CONFIG_CGROUPS */
> > >
> > > +static int bpf_xattrs_used(const struct lsm_xattr_ctx *ctx)
> > > +{
> > > +       const size_t prefix_len = sizeof(XATTR_BPF_LSM_SUFFIX) - 1;
> > > +       int i, n = 0;
> > > +
> > > +       for (i = 0; i < *ctx->xattr_count; i++) {
> > > +               const char *name = ctx->xattrs[i].name;
> > > +
> > > +               if (name && !strncmp(name, XATTR_BPF_LSM_SUFFIX, prefix_len))
> > > +                       n++;
> > > +       }
> > > +       return n;
> > > +}
> [...]
> > > +
> > >  static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
> > >  {
> > >         if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) ||
> > > -           prog->type == BPF_PROG_TYPE_LSM)
> > > +           prog->type == BPF_PROG_TYPE_LSM) {
> > > +               /* bpf_init_inode_xattr only attaches to inode_init_security. */
> > > +               if (kfunc_id == bpf_init_inode_xattr_btf_ids[0] &&
> > > +                   prog->aux->attach_btf_id != bpf_lsm_inode_init_security_btf_ids[0])
> > > +                       return -EACCES;
>
> We need to mark bpf_init_inode_xattr with KF_RCU (requires a trusted
> pointer), then we can remove this check above.
>
> > >                 return 0;
> > > +       }
> > >         return -EACCES;
> > >  }
> >
> > Perhaps I'm simply not seeing it, but is there a check to ensure that
> > there is only one BPF LSM calling into security_inode_init_security()
> > at any given time?  With the BPF LSM only reserving a single xattr
> > slot, multiple loaded BPF LSM programs providing
> > security_inode_init_security() callbacks will be a problem.
>
> I don't think there is such a check. Also, a single BPF LSM function
> may call the kfunc multiple times, which is also problematic.
>
> I think we will need to make the default bigger, and also introduce
> some realloc mechanism for the worst case scenario. This should
> work, but the code might be a bit messy.

Thanks for the clarification, that is what I was afraid of when
looking at the code, but I was hoping I was just missing it.

Increasing the default is an option, but I don't think we want to
support a dynamic reallocation scheme for the xattr slots, that will
likely get extremely messy with synchronization between the LSM
framework and BPF LSM hook registrations as well as special code to
handle inodes with lifetimes that are disjoint from the BPF LSM
programs ... I suppose there may be a way to do it, but it will surely
be ugly and come at a cost.

-- 
paul-moore.com

^ permalink raw reply

* Re: [PATCH v2 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: Song Liu @ 2026-05-04 23:09 UTC (permalink / raw)
  To: Paul Moore
  Cc: David Windsor, Alexander Viro, Christian Brauner,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, KP Singh,
	Matt Bobrowski, James Morris, Serge E. Hallyn, Mimi Zohar,
	Roberto Sassu, Dmitry Kasatkin, Stephen Smalley, Casey Schaufler,
	Jan Kara, John Fastabend, Martin KaFai Lau, Yonghong Song,
	Jiri Olsa, Eric Snowberg, Ondrej Mosnacek, linux-fsdevel,
	linux-kernel, bpf, linux-security-module, linux-integrity,
	selinux
In-Reply-To: <CAHC9VhQLN5NA_ZMMNyUdMCZVdwC3VM4PUnzka8xDK5rpR2a3sw@mail.gmail.com>

On Tue, May 5, 2026 at 12:42 AM Paul Moore <paul@paul-moore.com> wrote:
[...]
> > > Perhaps I'm simply not seeing it, but is there a check to ensure that
> > > there is only one BPF LSM calling into security_inode_init_security()
> > > at any given time?  With the BPF LSM only reserving a single xattr
> > > slot, multiple loaded BPF LSM programs providing
> > > security_inode_init_security() callbacks will be a problem.
> >
> > I don't think there is such a check. Also, a single BPF LSM function
> > may call the kfunc multiple times, which is also problematic.
> >
> > I think we will need to make the default bigger, and also introduce
> > some realloc mechanism for the worst case scenario. This should
> > work, but the code might be a bit messy.
>
> Thanks for the clarification, that is what I was afraid of when
> looking at the code, but I was hoping I was just missing it.
>
> Increasing the default is an option, but I don't think we want to
> support a dynamic reallocation scheme for the xattr slots, that will
> likely get extremely messy with synchronization between the LSM
> framework and BPF LSM hook registrations as well as special code to
> handle inodes with lifetimes that are disjoint from the BPF LSM
> programs ... I suppose there may be a way to do it, but it will surely
> be ugly and come at a cost.

BPF trampoline already handles all the synchronizations, such as
add hook, remove hook, etc. properly. So this is not that hard.
All we really need is to allocate a new array, copy pointers, and free
the old array. And we only really need this in the worst case
scenarios.
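
As a rough illustration of the "allocate a new array, copy pointers, free
the old array" step, here is a minimal userspace sketch.  The demo_slots
type, the doubling policy, and the initial size of 4 are all assumptions
made for the example, and it deliberately ignores locking and lifetime
questions, which would have to be solved separately in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable table of slot pointers. */
struct demo_slots {
	const void **slot;	/* array of pointers, may be NULL when cap == 0 */
	size_t used;		/* number of slots in use */
	size_t cap;		/* number of slots allocated */
};

/* Make room for one more entry: allocate, copy pointers, free the old array. */
static int demo_slots_reserve(struct demo_slots *s)
{
	const void **bigger;
	size_t new_cap;

	if (s->used < s->cap)
		return 0;	/* common case: the default size is enough */

	new_cap = s->cap ? s->cap * 2 : 4;
	bigger = calloc(new_cap, sizeof(*bigger));
	if (!bigger)
		return -ENOMEM;
	if (s->used)
		memcpy(bigger, s->slot, s->used * sizeof(*bigger));
	free(s->slot);
	s->slot = bigger;
	s->cap = new_cap;
	return 0;
}

static int demo_slots_add(struct demo_slots *s, const void *p)
{
	int err = demo_slots_reserve(s);

	if (err)
		return err;
	s->slot[s->used++] = p;
	return 0;
}

/* Usage: the fifth entry triggers exactly one reallocation (4 -> 8). */
void demo_slots_selfcheck(void)
{
	struct demo_slots s = { 0 };
	int v[5];

	for (int i = 0; i < 5; i++)
		assert(demo_slots_add(&s, &v[i]) == 0);
	assert(s.cap == 8);
	free(s.slot);
}
```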

Thanks,
Song

^ permalink raw reply

* Re: [v6 10/10] ipe: Add BPF program load policy enforcement via Hornet integration
From: Fan Wu @ 2026-05-04 23:52 UTC (permalink / raw)
  To: Blaise Boscaccy
  Cc: Jonathan Corbet, Paul Moore, James Morris, Serge E. Hallyn,
	Mickaël Salaün, Günther Noack,
	Dr. David Alan Gilbert, Andrew Morton, James.Bottomley, dhowells,
	Fan Wu, Ryan Foster, Randy Dunlap, linux-security-module,
	linux-doc, linux-kernel, bpf, Song Liu
In-Reply-To: <20260429191431.2345448-11-bboscaccy@linux.microsoft.com>

On Wed, Apr 29, 2026 at 12:15 PM Blaise Boscaccy
<bboscaccy@linux.microsoft.com> wrote:
>
> Add support for the bpf_prog_load_post_integrity LSM hook, enabling IPE
> to make policy decisions about BPF program loading based on integrity
> verdicts provided by the Hornet LSM.
>
> New policy operation:
>   op=BPF_PROG_LOAD - Matches BPF program load events
>
> New policy properties:
>   bpf_signature=NONE      - No Verdict
>   bpf_signature=OK        - Program signature and map hashes verified
>   bpf_signature=UNSIGNED  - No signature provided
>   bpf_signature=PARTIALSIG - Signature OK but no map hash data
>   bpf_signature=UNKNOWNKEY - Cert not trusted

This one should be: The keyring requested by the user is invalid.

>   bpf_signature=UNEXPECTED - An unexpected hash value was encountered
>   bpf_signature=FAULT      - System error during verification
>   bpf_signature=BADSIG    - Signature or map hash verification failed
>   bpf_keyring=BUILTIN     - Program was signed using a builtin keyring
>   bpf_keyring=SECONDARY   - Program was signed using the secondary keyring
>   bpf_keyring=PLATFORM    - Program was signed using the platform keyring
>   bpf_kernel=TRUE         - Program originated from kernelspace
>   bpf_kernel=FALSE        - Program originated from userspace
>
> These properties map directly to the lsm_integrity_verdict enum values
> provided by the Hornet LSM through security_bpf_prog_load_post_integrity.
>
> The feature is gated on CONFIG_IPE_PROP_BPF_SIGNATURE which depends on
> CONFIG_SECURITY_HORNET.
>
> Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
> ---
>  Documentation/admin-guide/LSM/ipe.rst | 162 +++++++++++++++++++++++++-
>  Documentation/security/ipe.rst        |  39 +++++++
>  security/ipe/Kconfig                  |  14 +++
>  security/ipe/audit.c                  |  15 +++
>  security/ipe/eval.c                   |  73 +++++++++++-
>  security/ipe/eval.h                   |  11 ++
>  security/ipe/hooks.c                  |  63 ++++++++++
>  security/ipe/hooks.h                  |  15 +++
>  security/ipe/ipe.c                    |  14 +++
>  security/ipe/ipe.h                    |   3 +
>  security/ipe/policy.h                 |  14 +++
>  security/ipe/policy_parser.c          |  27 +++++
>  12 files changed, 448 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/LSM/ipe.rst b/Documentation/admin-guide/LSM/ipe.rst
> index a756d81585317..4dfbf0d325a8a 100644
> --- a/Documentation/admin-guide/LSM/ipe.rst
> +++ b/Documentation/admin-guide/LSM/ipe.rst
> @@ -559,7 +559,8 @@ policy. Two properties are built-into the policy parser: 'op' and 'action'.
>  The other properties are used to restrict immutable security properties
>  about the files being evaluated. Currently those properties are:
>  '``boot_verified``', '``dmverity_signature``', '``dmverity_roothash``',
> -'``fsverity_signature``', '``fsverity_digest``'. A description of all
> +'``fsverity_signature``', '``fsverity_digest``', '``bpf_signature``',
> +'``bpf_keyring``', '``bpf_kernel``'. A description of all
>  properties supported by IPE are listed below:
>
>  op
> @@ -603,6 +604,14 @@ as the first token. IPE supports the following operations:
>        Controls loading IMA certificates through the Kconfigs,
>        ``CONFIG_IMA_X509_PATH`` and ``CONFIG_EVM_X509_PATH``.
>
> +   ``BPF_PROG_LOAD``:
> +
> +      Pertains to BPF programs being loaded via the ``bpf()`` syscall.
> +      This operation is used in conjunction with the ``bpf_signature``,
> +      ``bpf_keyring``, and ``bpf_kernel`` properties to control BPF
> +      program loading based on integrity verification provided by the
> +      Hornet LSM.
> +
>  action
>  ~~~~~~
>
> @@ -713,6 +722,105 @@ fsverity_signature
>
>        fsverity_signature=(TRUE|FALSE)
>
> +bpf_signature
> +~~~~~~~~~~~~~
> +
> +   This property can be utilized for authorization of BPF program loads based
> +   on the integrity verdict provided by the Hornet LSM. When a BPF program is
> +   loaded, Hornet performs cryptographic verification of the program's PKCS#7
> +   signature (if present) and passes an integrity verdict to IPE via the
> +   ``security_bpf_prog_load_post_integrity`` hook. IPE can then allow or deny
> +   the load based on the verdict.
> +
> +   This property depends on ``SECURITY_HORNET`` and is controlled by the
> +   ``IPE_PROP_BPF_SIGNATURE`` config option.
> +   The format of this property is::
> +
> +      bpf_signature=(NONE|OK|UNSIGNED|PARTIALSIG|UNKNOWNKEY|UNEXPECTED|FAULT|BADSIG)
> +
> +   The possible values correspond to the integrity verdicts from Hornet:
> +
> +      ``NONE``
> +
> +         No integrity verdict was set (default/uninitialized).
> +
> +      ``OK``
> +
> +         The BPF program's signature and all map hashes were successfully
> +         verified.
> +
> +      ``UNSIGNED``
> +
> +         No signature was provided with the BPF program.
> +
> +      ``PARTIALSIG``
> +
> +         The program signature was verified, but no authenticated map hash
> +         data was present.
> +
> +      ``UNKNOWNKEY``
> +
> +         The signing certificate is not trusted by the specified keyring.

Same as above.

> +
> +      ``UNEXPECTED``
> +
> +         An unexpected map hash value was encountered during verification.
> +
> +      ``FAULT``
> +
> +         A system error occurred during signature verification.
> +
> +      ``BADSIG``
> +
> +         The signature or hash verification failed.
> +
> +bpf_keyring
> +~~~~~~~~~~~~
> +
> +   This property can be utilized for authorization of BPF program loads based
> +   on the keyring specified in the ``bpf_attr`` during the ``BPF_PROG_LOAD``
> +   syscall. This allows policies to restrict which keyring must be used for
> +   signature verification of BPF programs.
> +
> +   This property shares the ``IPE_PROP_BPF_SIGNATURE`` config option with
> +   ``bpf_signature``.
> +   The format of this property is::
> +
> +      bpf_keyring=(BUILTIN|SECONDARY|PLATFORM)
> +
> +   The possible values correspond to the system keyrings:
> +
> +      ``BUILTIN``
> +
> +         The builtin trusted keyring (``.builtin_trusted_keys``), which
> +         contains keys embedded at kernel compile time.
> +
> +      ``SECONDARY``
> +
> +         The secondary trusted keyring (``.secondary_trusted_keys``), which
> +         includes both builtin trusted keys and keys added at runtime.
> +
> +      ``PLATFORM``
> +
> +         The platform keyring (``.platform``), which contains keys provided
> +         by the platform firmware (e.g. UEFI db keys).
> +
> +bpf_kernel
> +~~~~~~~~~~
> +
> +   This property can be utilized for authorization of BPF program loads based
> +   on whether the load originated from kernel space or user space. The BPF
> +   light skeleton infrastructure performs a secondary kernel-originated program
> +   load that will not carry a signature. This property allows policies to
> +   permit such kernel-originated loads while still requiring signatures for
> +   user-space loads.
> +
> +   This property shares the ``IPE_PROP_BPF_SIGNATURE`` config option with
> +   ``bpf_signature``.
> +   The format of this property is::
> +
> +      bpf_kernel=(TRUE|FALSE)
> +
>  Policy Examples
>  ---------------
>
> @@ -788,6 +896,58 @@ Allow execution of a specific fs-verity file
>
>     op=EXECUTE fsverity_digest=sha256:fd88f2b8824e197f850bf4c5109bea5cf0ee38104f710843bb72da796ba5af9e action=ALLOW
>
> +Allow only signed BPF programs
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +::
> +
> +   policy_name=Allow_Signed_BPF policy_version=0.0.0
> +   DEFAULT action=ALLOW
> +
> +   DEFAULT op=BPF_PROG_LOAD action=DENY
> +   op=BPF_PROG_LOAD bpf_kernel=TRUE action=ALLOW
> +   op=BPF_PROG_LOAD bpf_signature=OK action=ALLOW
> +
> +This policy allows all other operations but restricts BPF program loading
> +to only programs that either originate from kernel space (e.g. light skeleton
> +reloads) or have a valid signature verified by the Hornet LSM. Unsigned or
> +improperly signed BPF programs from user space will be denied.
> +
> +Allow signed BPF programs from a specific keyring
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +::
> +
> +   policy_name=Allow_BPF_Builtin_Keyring policy_version=0.0.0
> +   DEFAULT action=ALLOW
> +
> +   DEFAULT op=BPF_PROG_LOAD action=DENY
> +   op=BPF_PROG_LOAD bpf_kernel=TRUE action=ALLOW
> +   op=BPF_PROG_LOAD bpf_signature=OK bpf_keyring=BUILTIN action=ALLOW
> +
> +This policy further restricts BPF program loading to only accept programs
> +whose signatures were verified using the builtin trusted keyring. Programs
> +signed against the secondary or platform keyrings will be denied, providing
> +tighter control over which signing keys are acceptable.
> +
> +Allow signed BPF programs with relaxed partial signatures
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +::
> +
> +   policy_name=Allow_BPF_Partial policy_version=0.0.0
> +   DEFAULT action=ALLOW
> +
> +   DEFAULT op=BPF_PROG_LOAD action=DENY
> +   op=BPF_PROG_LOAD bpf_kernel=TRUE action=ALLOW
> +   op=BPF_PROG_LOAD bpf_signature=OK action=ALLOW
> +   op=BPF_PROG_LOAD bpf_signature=PARTIALSIG action=ALLOW
> +
> +This policy allows BPF programs that have been fully verified (``OK``) as
> +well as programs with a valid program signature but without authenticated
> +map hash data (``PARTIALSIG``). This can be useful during development or
> +for programs that do not use maps.
> +
>  Additional Information
>  ----------------------
>
> diff --git a/Documentation/security/ipe.rst b/Documentation/security/ipe.rst
> index 4a7d953abcdc3..de8fcf1dc173d 100644
> --- a/Documentation/security/ipe.rst
> +++ b/Documentation/security/ipe.rst
> @@ -412,6 +412,44 @@ a standard securityfs policy tree::
>
>  The policy is stored in the ``->i_private`` data of the MyPolicy inode.
>
> +BPF/Hornet Integration
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +IPE integrates with the Hornet LSM to enforce integrity policies on BPF
> +program loading. Hornet performs cryptographic verification of BPF program
> +signatures (PKCS#7 with authenticated attributes containing map hashes) and
> +provides an integrity verdict to IPE via the
> +``security_bpf_prog_load_post_integrity`` hook.
> +
> +The hook flow is:
> +
> +  1. User space invokes ``BPF_PROG_LOAD`` via the ``bpf()`` syscall.
> +  2. Hornet's ``bpf_prog_load_integrity`` hook calls ``hornet_check_program()``
> +     to verify the program's signature and map hashes.
> +  3. Hornet calls ``security_bpf_prog_load_post_integrity()`` with the
> +     resulting ``lsm_integrity_verdict``.
> +  4. IPE evaluates the verdict against the active policy's ``BPF_PROG_LOAD``
> +     rules and returns ``-EACCES`` if denied.
> +

This part needs to be updated.

> +Three properties are available for BPF policy rules:
> +
> +  - ``bpf_signature``: Matches against the integrity verdict (OK, UNSIGNED,
> +    BADSIG, etc.)
> +  - ``bpf_keyring``: Matches against the keyring specified in ``bpf_attr``
> +    (BUILTIN, SECONDARY, PLATFORM)
> +  - ``bpf_kernel``: Matches whether the load originated from kernel space
> +    (TRUE/FALSE). This is important because the BPF light skeleton
> +    infrastructure performs a secondary kernel-originated program load that
> +    does not carry a signature.
> +
> +All three properties are gated on ``CONFIG_IPE_PROP_BPF_SIGNATURE`` which
> +depends on ``CONFIG_SECURITY_HORNET``.
> +
> +The evaluation context (``struct ipe_eval_ctx``) carries three BPF-specific
> +fields: ``bpf_verdict`` (the integrity verdict enum), ``bpf_keyring_id``
> +(the ``s32`` keyring ID from ``bpf_attr``), and ``bpf_kernel`` (bool
> +indicating kernel origin).
> +
>  Tests
>  -----
>
> @@ -439,6 +477,7 @@ IPE has KUnit Tests for the policy parser. Recommended kunitconfig::
>    CONFIG_IPE_PROP_DM_VERITY_SIGNATURE=y
>    CONFIG_IPE_PROP_FS_VERITY=y
>    CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG=y
> +  CONFIG_IPE_PROP_BPF_SIGNATURE=y
>    CONFIG_SECURITY_IPE_KUNIT_TEST=y
>
>  In addition, IPE has a python based integration
> diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig
> index a110a6cd848b7..4c1d46847582b 100644
> --- a/security/ipe/Kconfig
> +++ b/security/ipe/Kconfig
> @@ -95,6 +95,20 @@ config IPE_PROP_FS_VERITY_BUILTIN_SIG
>
>           if unsure, answer Y.
>
> +config IPE_PROP_BPF_SIGNATURE
> +       bool "Enable support for Hornet BPF program signature verification"
> +       depends on SECURITY_HORNET
> +       help
> +         This option enables the 'bpf_signature' and 'bpf_keyring'

bpf_kernel is missing.

> +         properties within IPE policies. The 'bpf_signature' property
> +         allows IPE to make policy decisions based on the integrity
> +         verdict provided by the Hornet LSM when a BPF program is loaded.
> +         Verdicts include OK, UNSIGNED, PARTIALSIG, BADSIG, and others.
> +         The 'bpf_keyring' property allows policies to match against the
> +         keyring specified in bpf_attr (BUILTIN, SECONDARY, PLATFORM).
> +
> +         If unsure, answer Y.
> +
>  endmenu
>
>  config SECURITY_IPE_KUNIT_TEST
> diff --git a/security/ipe/audit.c b/security/ipe/audit.c
> index 3f0deeb549127..251c6ec2f8423 100644
> --- a/security/ipe/audit.c
> +++ b/security/ipe/audit.c
> @@ -41,6 +41,7 @@ static const char *const audit_op_names[__IPE_OP_MAX + 1] = {
>         "KEXEC_INITRAMFS",
>         "POLICY",
>         "X509_CERT",
> +       "BPF_PROG_LOAD",
>         "UNKNOWN",
>  };
>
> @@ -51,6 +52,7 @@ static const char *const audit_hook_names[__IPE_HOOK_MAX] = {
>         "MPROTECT",
>         "KERNEL_READ",
>         "KERNEL_LOAD",
> +       "BPF_PROG_LOAD",
>  };
>
>  static const char *const audit_prop_names[__IPE_PROP_MAX] = {
> @@ -62,6 +64,19 @@ static const char *const audit_prop_names[__IPE_PROP_MAX] = {
>         "fsverity_digest=",
>         "fsverity_signature=FALSE",
>         "fsverity_signature=TRUE",
> +       "bpf_signature=NONE",
> +       "bpf_signature=OK",
> +       "bpf_signature=UNSIGNED",
> +       "bpf_signature=PARTIALSIG",
> +       "bpf_signature=UNKNOWNKEY",
> +       "bpf_signature=UNEXPECTED",
> +       "bpf_signature=FAULT",
> +       "bpf_signature=BADSIG",
> +       "bpf_keyring=BUILTIN",
> +       "bpf_keyring=SECONDARY",
> +       "bpf_keyring=PLATFORM",
> +       "bpf_kernel=FALSE",
> +       "bpf_kernel=TRUE",
>  };
>
>  /**
> diff --git a/security/ipe/eval.c b/security/ipe/eval.c
> index 21439c5be3364..9a6d583fea125 100644
> --- a/security/ipe/eval.c
> +++ b/security/ipe/eval.c
> @@ -11,6 +11,7 @@
>  #include <linux/rcupdate.h>
>  #include <linux/moduleparam.h>
>  #include <linux/fsverity.h>
> +#include <linux/verification.h>
>
>  #include "ipe.h"
>  #include "eval.h"
> @@ -265,8 +266,52 @@ static bool evaluate_fsv_sig_true(const struct ipe_eval_ctx *const ctx)
>  }
>  #endif /* CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG */
>
> +#ifdef CONFIG_IPE_PROP_BPF_SIGNATURE
> +/**
> + * evaluate_bpf_sig() - Evaluate @ctx against a bpf_signature property.
> + * @ctx: Supplies a pointer to the context being evaluated.
> + * @expected: The expected lsm_integrity_verdict to match against.
> + *
> + * Return:
> + * * %true     - The current @ctx matches the expected verdict
> + * * %false    - The current @ctx doesn't match the expected verdict
> + */
> +static bool evaluate_bpf_sig(const struct ipe_eval_ctx *const ctx,
> +                            enum lsm_integrity_verdict expected)
> +{
> +       return ctx->bpf_verdict == expected;
> +}
> +#else
> +static bool evaluate_bpf_sig(const struct ipe_eval_ctx *const ctx,
> +                            enum lsm_integrity_verdict expected)
> +{
> +       return false;
> +}
> +#endif /* CONFIG_IPE_PROP_BPF_SIGNATURE */
> +
> +#ifdef CONFIG_IPE_PROP_BPF_SIGNATURE
> +/**
> + * evaluate_bpf_keyring() - Evaluate @ctx against a bpf_keyring property.
> + * @ctx: Supplies a pointer to the context being evaluated.
> + * @expected: The expected keyring_id to match against.
> + *
> + * Return:
> + * * %true     - The current @ctx matches the expected keyring
> + * * %false    - The current @ctx doesn't match the expected keyring
> + */
> +static bool evaluate_bpf_keyring(const struct ipe_eval_ctx *const ctx,
> +                                s32 expected)
> +{
> +       return ctx->bpf_keyring_id == expected;
> +}
> +#else
> +static bool evaluate_bpf_keyring(const struct ipe_eval_ctx *const ctx,
> +                                s32 expected)
> +{
> +       return false;
> +}
> +#endif /* CONFIG_IPE_PROP_BPF_SIGNATURE */
>  /**
> - * evaluate_property() - Analyze @ctx against a rule property.
>   * @ctx: Supplies a pointer to the context to be evaluated.
>   * @p: Supplies a pointer to the property to be evaluated.
>   *
> @@ -297,6 +342,32 @@ static bool evaluate_property(const struct ipe_eval_ctx *const ctx,
>                 return evaluate_fsv_sig_false(ctx);
>         case IPE_PROP_FSV_SIG_TRUE:
>                 return evaluate_fsv_sig_true(ctx);
> +       case IPE_PROP_BPF_SIG_NONE:
> +               return evaluate_bpf_sig(ctx, LSM_INT_VERDICT_NONE);
> +       case IPE_PROP_BPF_SIG_OK:
> +               return evaluate_bpf_sig(ctx, LSM_INT_VERDICT_OK);
> +       case IPE_PROP_BPF_SIG_UNSIGNED:
> +               return evaluate_bpf_sig(ctx, LSM_INT_VERDICT_UNSIGNED);
> +       case IPE_PROP_BPF_SIG_PARTIALSIG:
> +               return evaluate_bpf_sig(ctx, LSM_INT_VERDICT_PARTIALSIG);
> +       case IPE_PROP_BPF_SIG_UNKNOWNKEY:
> +               return evaluate_bpf_sig(ctx, LSM_INT_VERDICT_UNKNOWNKEY);
> +       case IPE_PROP_BPF_SIG_UNEXPECTED:
> +               return evaluate_bpf_sig(ctx, LSM_INT_VERDICT_UNEXPECTED);
> +       case IPE_PROP_BPF_SIG_FAULT:
> +               return evaluate_bpf_sig(ctx, LSM_INT_VERDICT_FAULT);
> +       case IPE_PROP_BPF_SIG_BADSIG:
> +               return evaluate_bpf_sig(ctx, LSM_INT_VERDICT_BADSIG);
> +       case IPE_PROP_BPF_KEYRING_BUILTIN:
> +               return evaluate_bpf_keyring(ctx, 0);
> +       case IPE_PROP_BPF_KEYRING_SECONDARY:
> +               return evaluate_bpf_keyring(ctx, (s32)(unsigned long)VERIFY_USE_SECONDARY_KEYRING);
> +       case IPE_PROP_BPF_KEYRING_PLATFORM:
> +               return evaluate_bpf_keyring(ctx, (s32)(unsigned long)VERIFY_USE_PLATFORM_KEYRING);
> +       case IPE_PROP_BPF_KERNEL_FALSE:
> +               return !ctx->bpf_kernel;
> +       case IPE_PROP_BPF_KERNEL_TRUE:
> +               return ctx->bpf_kernel;

bpf_kernel part needs to be guarded by #ifdef, like the other two.
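
Something along these lines, mirroring evaluate_bpf_sig() and
evaluate_bpf_keyring().  This is only a standalone sketch: the
evaluate_bpf_kernel() name is a suggestion, struct demo_eval_ctx is a
reduced stand-in for struct ipe_eval_ctx, and the #define at the top
exists only so the snippet exercises the enabled path outside of Kconfig.

```c
#include <assert.h>
#include <stdbool.h>

/* In the kernel this comes from Kconfig; defined here for the sketch only. */
#define CONFIG_IPE_PROP_BPF_SIGNATURE 1

/* Reduced stand-in for struct ipe_eval_ctx; only the field used here. */
struct demo_eval_ctx {
	bool bpf_kernel;
};

#ifdef CONFIG_IPE_PROP_BPF_SIGNATURE
/* Match when the load origin (kernel vs. user) equals @expected. */
static bool evaluate_bpf_kernel(const struct demo_eval_ctx *const ctx,
				bool expected)
{
	return ctx->bpf_kernel == expected;
}
#else
/* Property compiled out: never match, like the other bpf_* stubs. */
static bool evaluate_bpf_kernel(const struct demo_eval_ctx *const ctx,
				bool expected)
{
	(void)ctx;
	(void)expected;
	return false;
}
#endif /* CONFIG_IPE_PROP_BPF_SIGNATURE */

/* Usage: the two switch cases would call the helper with true/false. */
void demo_bpf_kernel_selfcheck(void)
{
	struct demo_eval_ctx ctx = { .bpf_kernel = true };

	assert(evaluate_bpf_kernel(&ctx, true));
	assert(!evaluate_bpf_kernel(&ctx, false));
}
```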

-Fan

>         default:
>                 return false;
>         }

^ permalink raw reply

* Re: [PATCH v2 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: David Windsor @ 2026-05-05  1:07 UTC (permalink / raw)
  To: Song Liu
  Cc: Paul Moore, Alexander Viro, Christian Brauner, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman,
	Kumar Kartikeya Dwivedi, KP Singh, Matt Bobrowski, James Morris,
	Serge E. Hallyn, Mimi Zohar, Roberto Sassu, Dmitry Kasatkin,
	Stephen Smalley, Casey Schaufler, Jan Kara, John Fastabend,
	Martin KaFai Lau, Yonghong Song, Jiri Olsa, Eric Snowberg,
	Ondrej Mosnacek, linux-fsdevel, linux-kernel, bpf,
	linux-security-module, linux-integrity, selinux
In-Reply-To: <CAPhsuW5nDaLAV5UfAHeX6QPeF6bs-WDkFYOzYO7Q9_O6v=jEHA@mail.gmail.com>

On Mon, May 4, 2026 at 7:09 PM Song Liu <song@kernel.org> wrote:
>
> On Tue, May 5, 2026 at 12:42 AM Paul Moore <paul@paul-moore.com> wrote:
> [...]
> > > > Perhaps I'm simply not seeing it, but is there a check to ensure that
> > > > there is only one BPF LSM calling into security_inode_init_security()
> > > > at any given time?  With the BPF LSM only reserving a single xattr
> > > > slot, multiple loaded BPF LSM programs providing
> > > > security_inode_init_security() callbacks will be a problem.
> > >
> > > I don't think there is such a check. Also, a single BPF LSM function
> > > may call the kfunc multiple times, which is also problematic.
> > >

bpf_xattrs_used() guards against this. The lsm_xattr_ctx is shared
between all callers, so xattr additions by another LSM (or by calling
it multiple times in the same function) will be tracked by this.
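
Roughly, the accounting works like the userspace sketch below.  The demo_
types, the "bpf." prefix value, the one-slot budget, and the -ENOSPC
return are assumptions for illustration and are not copied from the
patch; only the counting loop follows bpf_xattrs_used() as quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PREFIX		"bpf."	/* stand-in for XATTR_BPF_LSM_SUFFIX */
#define DEMO_MAX_BPF_SLOTS	1	/* hypothetical per-inode budget */

struct demo_xattr { const char *name; };

/* Stand-in for struct lsm_xattr_ctx: one array shared by all callers. */
struct demo_xattr_ctx {
	struct demo_xattr *xattrs;
	int *xattr_count;
};

/* Count the slots already filled with a BPF-prefixed name. */
static int demo_xattrs_used(const struct demo_xattr_ctx *ctx)
{
	const size_t prefix_len = sizeof(DEMO_PREFIX) - 1;
	int i, n = 0;

	for (i = 0; i < *ctx->xattr_count; i++) {
		const char *name = ctx->xattrs[i].name;

		if (name && !strncmp(name, DEMO_PREFIX, prefix_len))
			n++;
	}
	return n;
}

/* Claim the next slot unless the BPF budget or the array is exhausted. */
static int demo_claim_slot(struct demo_xattr_ctx *ctx, const char *name,
			   int capacity)
{
	if (demo_xattrs_used(ctx) >= DEMO_MAX_BPF_SLOTS)
		return -ENOSPC;
	if (*ctx->xattr_count >= capacity)
		return -ENOSPC;
	ctx->xattrs[(*ctx->xattr_count)++].name = name;
	return 0;
}

/* Usage: a second claim fails whether it comes from the same program or not. */
void demo_xattr_selfcheck(void)
{
	struct demo_xattr slots[4] = { { NULL } };
	int count = 0;
	struct demo_xattr_ctx ctx = { .xattrs = slots, .xattr_count = &count };

	assert(demo_claim_slot(&ctx, "bpf.label", 4) == 0);
	assert(demo_claim_slot(&ctx, "bpf.other", 4) == -ENOSPC);
}
```

Because the count is derived from the shared array on every call, it does
not matter which program (or which call within one program) filled the
earlier slot.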

> > > I think we will need to make the default bigger, and also introduce
> > > some realloc mechanism for the worst case scenario. This should
> > > work, but the code might be a bit messy.
> >
> > Thanks for the clarification, that is what I was afraid of when
> > looking at the code, but I was hoping I was just missing it.
> >
> > Increasing the default is an option, but I don't think we want to
> > support a dynamic reallocation scheme for the xattr slots, that will
> > likely get extremely messy with synchronization between the LSM
> > framework and BPF LSM hook registrations as well as special code to
> > handle inodes with lifetimes that are disjoint from the BPF LSM
> > programs ... I suppose there may be a way to do it, but it will surely
> > be ugly and come at a cost.
>
> BPF trampoline already handles all the synchronizations, such as
> add hook, remove hook, etc. properly. So this is not that hard.
> All we really need is to allocate a new array, copy pointers, and free
> the old array. And we only really need this in the worst case
> scenarios.
>

How many bpf-lsm programs do we envision being attached at once? I'd
think that stacking of bpf-lsms would be difficult to reason about
(more so than static LSMs) and won't work that well in practice, but
I may be wrong. Most LSMs use 1 xattr; Smack is the only one that
uses 2, IIRC.

> Thanks,
> Song

^ permalink raw reply

