Linux Security Modules development


* Re: [PATCH RFC bpf-next 1/4] audit: Implement bpf_audit_log_*() wrappers
From: bot+bpf-ci @ 2026-03-11 22:08 UTC
  To: fred, paul, jmorris, serge, eparis, ast, daniel, andrii,
	martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, shuah, mic, gnoack
  Cc: linux-kernel, linux-security-module, audit, bpf, linux-kselftest,
	kernel-team, fred, ast, andrii, daniel, martin.lau, eddyz87,
	yonghong.song, clm, ihor.solodrai
In-Reply-To: <20260311-bpf-auditd-send-message-v1-1-10a62db5c92f@cloudflare.com>

[-- Attachment #1: Type: text/plain, Size: 2839 bytes --]

> diff --git a/security/lsm_audit_kfuncs.c b/security/lsm_audit_kfuncs.c
> --- /dev/null
> +++ b/security/lsm_audit_kfuncs.c

[ ... ]

> +__bpf_kfunc
> +struct bpf_audit_context *bpf_audit_log_start(struct bpf_prog_aux *aux)
> +{
> +	char comm[sizeof(current->comm)];
> +	struct bpf_audit_context *ac;
> +
> +	ac = bpf_mem_cache_alloc(&bpf_audit_context_ma);
> +	if (!ac)
> +		return NULL;
> +
> +	memset(ac, 0, sizeof(*ac));
> +	ac->ab = audit_log_start(audit_context(),
> +				 (aux->might_sleep) ? GFP_KERNEL : GFP_ATOMIC,
> +				 AUDIT_BPF_LSM_ACCESS);
> +	if (!ac->ab) {
> +		bpf_mem_cache_free(&bpf_audit_context_ma, ac);
> +		return NULL;
> +	}
> +
> +	audit_log_format(ac->ab, "prog-id=%d", aux->id);

aux->id is u32 -- should this be %u?
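
To make the difference concrete, here is a small user-space sketch (not
kernel code). The helper name format_id and the sample value are
hypothetical, chosen only for illustration: once a 32-bit value exceeds
INT_MAX, a signed conversion and an unsigned one render the same bits
differently.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: render the same 32-bit ID with a signed and an
 * unsigned conversion so the two results can be compared side by side.
 */
static void format_id(uint32_t id, char *as_signed, char *as_unsigned,
		      size_t len)
{
	/* The explicit cast makes the signed reinterpretation visible. */
	snprintf(as_signed, len, "prog-id=%d", (int)id);
	/* PRIu32 is the portable unsigned conversion for uint32_t. */
	snprintf(as_unsigned, len, "prog-id=%" PRIu32, id);
}
```

On common two's-complement platforms, format_id(3000000000u, ...) yields
"prog-id=-1294967296" for the signed form and "prog-id=3000000000" for
the unsigned one.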

> +
> +	/* Audit may not have a filter configured for syscalls. Include
> +	 * potentionally redundant pid & comm information

Minor typo: "potentionally" -> "potentially".

> +	 */

[ ... ]

> +__bpf_kfunc int bpf_audit_log_net_sock(struct bpf_audit_context *ac, int netif,
> +				       const struct socket *sock)
> +{
> +	struct lsm_network_audit net = { .sk = sock->sk, .netif = netif };
> +	struct common_audit_data ad;
> +
> +	ad.type = LSM_AUDIT_DATA_NET;
> +	ad.u.net = &net;
> +	return __audit_log_lsm_data(ac, &ad);
> +}
> +
> +__bpf_kfunc int
> +bpf_audit_log_net_sockaddr(struct bpf_audit_context *ac, int netif,
> +			   const struct sockaddr *saddr__nullable,
> +			   const struct sockaddr *daddr__nullable, int addrlen)
> +{
> +	struct lsm_network_audit net;
                                    ^^^

Unlike bpf_audit_log_net_sock() above which uses a designated
initializer that zero-fills the struct, this declaration leaves
net uninitialized.

In particular, net.sk is never assigned anywhere in this function.
When audit_log_lsm_data() handles LSM_AUDIT_DATA_NET, it does:

    if (a->u.net->sk) {
        const struct sock *sk = a->u.net->sk;
        switch (sk->sk_family) {

Since net.sk contains stack garbage, this will likely dereference
an invalid pointer and crash.

Additionally, when only one of saddr__nullable or daddr__nullable
is provided, the corresponding sport/dport and address fields for
the missing side remain uninitialized, and audit_log_lsm_data()
unconditionally prints both sides:

    print_ipv4_addr(ab, a->u.net->v4info.saddr, a->u.net->sport, ...);
    print_ipv4_addr(ab, a->u.net->v4info.daddr, a->u.net->dport, ...);

Would initializing with something like
struct lsm_network_audit net = { .netif = netif }; address both
issues?
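
As a rough illustration of why the designated initializer helps, here is
a user-space sketch. struct net_audit_sketch and make_net are
hypothetical stand-ins, not the real struct lsm_network_audit layout:
in C, members that a designated initializer does not name are
zero-initialized, so the pointer and port fields start out as NULL and 0
rather than stack garbage.

```c
#include <stddef.h>

/* Hypothetical stand-in; field names are illustrative only. */
struct net_audit_sketch {
	int netif;
	const void *sk;
	unsigned short sport;
	unsigned short dport;
};

/*
 * Only .netif is named here. The remaining members are implicitly
 * zero-initialized, unlike a bare "struct net_audit_sketch net;"
 * declaration, which leaves every member indeterminate.
 */
static struct net_audit_sketch make_net(int netif)
{
	struct net_audit_sketch net = { .netif = netif };

	return net;
}
```

With this form, a later "if (net.sk)" check sees NULL instead of an
arbitrary pointer, and an unset sport/dport reads as 0.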

> +	struct common_audit_data ad;
> +
> +	net.netif = netif;

[ ... ]


---
AI reviewed your patch. Please fix the bug or reply by email explaining why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/22976062351


* [PATCH RFC bpf-next 4/4] selftests/bpf: Add lsm_audit_kfuncs tests
From: Frederick Lawler @ 2026-03-11 21:31 UTC
  To: Paul Moore, James Morris, Serge E. Hallyn, Eric Paris,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Mickaël Salaün, Günther Noack
  Cc: linux-kernel, linux-security-module, audit, bpf, linux-kselftest,
	kernel-team, Frederick Lawler
In-Reply-To: <20260311-bpf-auditd-send-message-v1-0-10a62db5c92f@cloudflare.com>

Add selftests for the audit kfunc BPF LSM functionality including
both the test program and BPF progs.

Assisted-by: Claude:claude-4.5-opus
Signed-off-by: Frederick Lawler <fred@cloudflare.com>
---
 .../selftests/bpf/prog_tests/lsm_audit_kfuncs.c    | 598 +++++++++++++++++++++
 .../selftests/bpf/progs/test_lsm_audit_kfuncs.c    | 263 +++++++++
 2 files changed, 861 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/lsm_audit_kfuncs.c b/tools/testing/selftests/bpf/prog_tests/lsm_audit_kfuncs.c
new file mode 100644
index 0000000000000000000000000000000000000000..de18e1a3c79578d4151a12a029f2a9e6cc7648e3
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/lsm_audit_kfuncs.c
@@ -0,0 +1,598 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Cloudflare */
+#define _GNU_SOURCE
+
+#include <errno.h>
+#include <fcntl.h>
+#include <poll.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+#include <linux/audit.h>
+#include <linux/netlink.h>
+#include <netinet/in.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#include <sys/stat.h>
+#include <sys/un.h>
+
+#include "audit_helpers.h"
+#include "test_lsm_audit_kfuncs.skel.h"
+#include "test_progs.h"
+
+#ifndef AUDIT_BPF_LSM_ACCESS
+#define AUDIT_BPF_LSM_ACCESS 1427
+#endif
+
+static inline struct sockaddr_in addr4(void)
+{
+	return (struct sockaddr_in){
+		.sin_family = AF_INET,
+		.sin_port = htons(1234),
+		.sin_addr.s_addr = htonl(INADDR_LOOPBACK),
+	};
+}
+
+static inline struct sockaddr_in6 addr6(void)
+{
+	return (struct sockaddr_in6){
+		.sin6_family = AF_INET6,
+		.sin6_port = htons(1234),
+		.sin6_addr = in6addr_loopback,
+	};
+}
+
+static int bind_connect(const struct sockaddr *addr, int addrlen)
+{
+	int err;
+	int sock;
+	int opt = 1;
+	socklen_t optlen = sizeof(opt);
+
+	sock = socket(addr->sa_family, SOCK_STREAM, 0);
+	if (!ASSERT_OK_FD(sock, "socket"))
+		return 1;
+
+	err = setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &opt, optlen);
+	if (!ASSERT_OK(err, "setsockopt"))
+		goto done;
+
+	err = bind(sock, addr, addrlen);
+	if (!ASSERT_OK(err, "bind"))
+		goto done;
+
+	err = connect(sock, addr, addrlen);
+	ASSERT_OK(err, "connect");
+
+	err = getsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &opt, &optlen);
+	ASSERT_OK(err, "getsockopt");
+
+done:
+	close(sock);
+	return err;
+}
+
+static void test_audit_log_sockaddr_src(struct audit_observer *obs,
+					struct test_lsm_audit_kfuncs *skel)
+{
+	struct sockaddr_in sin = addr4();
+	struct sockaddr_in6 sin6 = addr6();
+	struct bpf_link *link;
+
+	link = bpf_program__attach_lsm(skel->progs.test_sockaddr_src);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS,
+			      "cause=\"bind4\" saddr=127.0.0.1 src=1234 netif=lo",
+			      1);
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS,
+			      "cause=\"bind6\" saddr=::1 src=1234 netif=lo", 1);
+
+	if (bind_connect((const struct sockaddr *)&sin, sizeof(sin)))
+		goto done;
+
+	if (bind_connect((const struct sockaddr *)&sin6, sizeof(sin6)))
+		goto done;
+
+	ASSERT_OK(audit_observer_wait(obs), "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+done:
+	bpf_link__destroy(link);
+}
+
+static void test_audit_log_sockaddr_dest(struct audit_observer *obs,
+					 struct test_lsm_audit_kfuncs *skel)
+{
+	struct sockaddr_in sin = addr4();
+	struct sockaddr_in6 sin6 = addr6();
+	struct bpf_link *link;
+
+	link = bpf_program__attach_lsm(skel->progs.test_sockaddr_dest);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS,
+			      "cause=\"connect4\" daddr=127.0.0.1 dest=1234 netif=lo",
+			      1);
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS,
+			      "cause=\"connect6\" daddr=::1 dest=1234 netif=lo",
+			      1);
+
+	if (bind_connect((const struct sockaddr *)&sin, sizeof(sin)))
+		goto out;
+
+	if (bind_connect((const struct sockaddr *)&sin6, sizeof(sin6)))
+		goto out;
+
+	ASSERT_OK(audit_observer_wait(obs), "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+out:
+	bpf_link__destroy(link);
+}
+
+static void test_audit_log_sock(struct audit_observer *obs,
+				struct test_lsm_audit_kfuncs *skel)
+{
+	struct sockaddr_in sin = addr4();
+	struct sockaddr_in6 sin6 = addr6();
+	struct bpf_link *link;
+
+	link = bpf_program__attach_lsm(skel->progs.test_sock);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS,
+			      "cause=\"sock4\" laddr=127.0.0.1 lport=1234 faddr=127.0.0.1 fport=1234 netif=lo",
+			      1);
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS,
+			      "cause=\"sock6\" laddr=::1 lport=1234 faddr=::1 fport=1234 netif=lo",
+			      1);
+
+	if (bind_connect((const struct sockaddr *)&sin, sizeof(sin)))
+		goto out;
+
+	if (bind_connect((const struct sockaddr *)&sin6, sizeof(sin6)))
+		goto out;
+
+	ASSERT_OK(audit_observer_wait(obs), "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+out:
+	bpf_link__destroy(link);
+}
+
+static void test_audit_log_sock_unix(struct audit_observer *obs,
+				     struct test_lsm_audit_kfuncs *skel)
+{
+	struct sockaddr_un addr;
+	struct bpf_link *link;
+	char expected[256];
+	char sun_path[108];
+	int server_fd = -1;
+	int opt = 1;
+	socklen_t optlen = sizeof(opt);
+	int err;
+
+	snprintf(sun_path, sizeof(sun_path), "/root/tmp/bpf_audit_test_%d.sock",
+		 getpid());
+
+	/* Ensure directory exists */
+	mkdir("/root/tmp", 0755);
+	unlink(sun_path);
+
+	link = bpf_program__attach_lsm(skel->progs.test_sock_unix);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	snprintf(expected, sizeof(expected), "cause=\"sock_unix\" path=\"%s\"",
+		 sun_path);
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS, expected, 1);
+
+	memset(&addr, 0, sizeof(addr));
+	addr.sun_family = AF_UNIX;
+	strncpy(addr.sun_path, sun_path, sizeof(addr.sun_path) - 1);
+
+	server_fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	if (!ASSERT_OK_FD(server_fd, "socket"))
+		goto out;
+
+	err = bind(server_fd, (struct sockaddr *)&addr, sizeof(addr));
+	if (!ASSERT_OK(err, "bind"))
+		goto out;
+
+	err = getsockopt(server_fd, SOL_SOCKET, SO_REUSEADDR, &opt, &optlen);
+	ASSERT_OK(err, "getsockopt");
+
+	ASSERT_OK(audit_observer_wait(obs), "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+out:
+	if (server_fd >= 0)
+		close(server_fd);
+	unlink(sun_path);
+	bpf_link__destroy(link);
+}
+
+static void test_audit_log_file(struct audit_observer *obs,
+				struct test_lsm_audit_kfuncs *skel)
+{
+	struct bpf_link *link;
+	int err;
+	int fd;
+
+	link = bpf_program__attach_lsm(skel->progs.test_file);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS,
+			      "cause=\"file\" path=\"/dev/null\" dev=\"devtmpfs\" ino=4",
+			      1);
+
+	fd = open("/dev/null", O_RDONLY);
+	if (!ASSERT_OK_FD(fd, "open(/dev/null)"))
+		goto out;
+	close(fd);
+
+	err = audit_observer_wait(obs);
+	ASSERT_OK(err, "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+out:
+	bpf_link__destroy(link);
+}
+
+static void test_audit_log_path(struct audit_observer *obs,
+				struct test_lsm_audit_kfuncs *skel)
+{
+	struct bpf_link *link;
+	int err;
+	int fd;
+
+	link = bpf_program__attach_lsm(skel->progs.test_file_path);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS,
+			      "cause=\"path\" path=\"/dev/null\" dev=\"devtmpfs\" ino=4",
+			      1);
+
+	fd = open("/dev/null", O_RDONLY);
+	if (!ASSERT_OK_FD(fd, "open(/dev/null)"))
+		goto out;
+	close(fd);
+
+	err = audit_observer_wait(obs);
+	ASSERT_OK(err, "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+out:
+	bpf_link__destroy(link);
+}
+
+static void test_audit_log_dentry(struct audit_observer *obs,
+				  struct test_lsm_audit_kfuncs *skel)
+{
+	struct bpf_link *link;
+	char expected[128];
+	char buf[64];
+	int err;
+
+	link = bpf_program__attach_lsm(skel->progs.test_dentry);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	snprintf(expected, sizeof(expected),
+		 "cause=\"dentry\" name=\"exe\" dev=");
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS, expected, 1);
+
+	/* readlink triggers inode_readlink hook */
+	err = readlink("/proc/self/exe", buf, sizeof(buf));
+	if (!ASSERT_GT(err, 0, "readlink(/proc/self/exe)"))
+		goto out;
+
+	err = audit_observer_wait(obs);
+	ASSERT_OK(err, "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+out:
+	bpf_link__destroy(link);
+}
+
+static void test_audit_log_inode(struct audit_observer *obs,
+				 struct test_lsm_audit_kfuncs *skel)
+{
+	struct bpf_link *link;
+	char expected[128];
+	struct stat st;
+	int err;
+	int fd;
+
+	if (!ASSERT_OK(stat("/dev/null", &st), "stat(/dev/null)"))
+		return;
+
+	link = bpf_program__attach_lsm(skel->progs.test_inode);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	snprintf(expected, sizeof(expected),
+		 "cause=\"inode\" name=\"null\" dev=\"devtmpfs\" ino=%lu",
+		 st.st_ino);
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS, expected, 1);
+
+	fd = open("/dev/null", O_RDONLY);
+	if (!ASSERT_OK_FD(fd, "open(/dev/null)"))
+		goto out;
+	close(fd);
+
+	err = audit_observer_wait(obs);
+	ASSERT_OK(err, "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+out:
+	bpf_link__destroy(link);
+}
+
+static void test_audit_log_task(struct audit_observer *obs,
+				struct test_lsm_audit_kfuncs *skel)
+{
+	struct bpf_link *link;
+	char expected[128];
+	pid_t pid;
+	int err;
+
+	pid = getpid();
+
+	link = bpf_program__attach_lsm(skel->progs.test_task);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	snprintf(expected, sizeof(expected),
+		 "cause=\"task\" opid=%d ocomm=\"test_progs\"", pid);
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS, expected, 1);
+
+	err = getpgid(pid);
+	if (!ASSERT_GT(err, -1, "pid pgid match"))
+		goto out;
+
+	err = audit_observer_wait(obs);
+	ASSERT_OK(err, "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+out:
+	bpf_link__destroy(link);
+}
+
+static void test_audit_log_cap(struct audit_observer *obs,
+			       struct test_lsm_audit_kfuncs *skel)
+{
+	struct bpf_link *link;
+	int err;
+	int fd;
+
+	link = bpf_program__attach_lsm(skel->progs.test_cap);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS,
+			      "cause=\"cap\" capability=", 1);
+
+	fd = open("/proc/kallsyms", O_RDONLY);
+	if (!ASSERT_OK_FD(fd, "open(/proc/kallsyms)"))
+		goto out;
+	close(fd);
+
+	err = audit_observer_wait(obs);
+	ASSERT_OK(err, "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+out:
+	bpf_link__destroy(link);
+}
+
+static void test_audit_log_ioctl_op(struct audit_observer *obs,
+				    struct test_lsm_audit_kfuncs *skel)
+{
+	struct bpf_link *link;
+	char expected[128];
+	struct stat st;
+	int err;
+	int fd;
+
+	if (!ASSERT_OK(stat("/dev/null", &st), "stat(/dev/null)"))
+		return;
+
+	link = bpf_program__attach_lsm(skel->progs.test_ioctl_op);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	snprintf(expected, sizeof(expected),
+		 "cause=\"ioctl_op\" path=\"/dev/null\" dev=\"devtmpfs\" ino=%lu ioctlcmd=0x%x",
+		 st.st_ino, TCGETS);
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS, expected, 1);
+
+	fd = open("/dev/null", O_RDONLY);
+	if (!ASSERT_OK_FD(fd, "open(/dev/null)"))
+		goto out;
+
+	/* ioctl will fail with ENOTTY but the LSM hook fires regardless */
+	ioctl(fd, TCGETS, NULL);
+	close(fd);
+
+	err = audit_observer_wait(obs);
+	ASSERT_OK(err, "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+out:
+	bpf_link__destroy(link);
+}
+
+static void test_audit_log_sleepable(struct audit_observer *obs,
+				     struct test_lsm_audit_kfuncs *skel)
+{
+	struct bpf_link *link;
+	int err;
+	int fd;
+
+	link = bpf_program__attach_lsm(skel->progs.test_sleepable);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS,
+			      "cause=\"sleepable\" path=\"/dev/null\" dev=\"devtmpfs\" ino=4",
+			      1);
+
+	fd = open("/dev/null", O_RDONLY);
+	if (!ASSERT_OK_FD(fd, "open(/dev/null)"))
+		goto out;
+	close(fd);
+
+	err = audit_observer_wait(obs);
+	ASSERT_OK(err, "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+out:
+	bpf_link__destroy(link);
+}
+
+static void
+test_audit_log_sockaddr_both_null(struct audit_observer *obs,
+				  struct test_lsm_audit_kfuncs *skel)
+{
+	struct sockaddr_in sin = addr4();
+	struct bpf_link *link;
+
+	link = bpf_program__attach_lsm(skel->progs.test_sockaddr_both_null);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	/* Should see cause but no saddr/daddr since both were NULL */
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS,
+			      "cause=\"sockaddr_both_null\"", 1);
+
+	bind_connect((const struct sockaddr *)&sin, sizeof(sin));
+
+	ASSERT_OK(audit_observer_wait(obs), "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+	bpf_link__destroy(link);
+}
+
+static void
+test_audit_log_sockaddr_small_addrlen(struct audit_observer *obs,
+				      struct test_lsm_audit_kfuncs *skel)
+{
+	struct sockaddr_in sin = addr4();
+	struct bpf_link *link;
+
+	link = bpf_program__attach_lsm(skel->progs.test_sockaddr_small_addrlen);
+	if (!ASSERT_OK_PTR(link, "attach"))
+		return;
+
+	audit_observer_reset(obs);
+
+	/* Should see cause but no saddr since addrlen was too small */
+	audit_observer_expect(obs, AUDIT_BPF_LSM_ACCESS,
+			      "cause=\"sockaddr_small_addrlen\"", 1);
+
+	bind_connect((const struct sockaddr *)&sin, sizeof(sin));
+
+	ASSERT_OK(audit_observer_wait(obs), "audit_observer_wait");
+	ASSERT_TRUE(audit_observer_check_satisfied(obs),
+		    "all expectations met");
+
+	bpf_link__destroy(link);
+}
+
+void test_lsm_audit_kfuncs(void)
+{
+	struct test_lsm_audit_kfuncs *skel = NULL;
+	struct audit_observer obs;
+	FILE *log = NULL;
+	int audit_fd;
+
+	audit_fd = audit_init();
+	if (!ASSERT_GE(audit_fd, 0, "audit_init"))
+		return;
+
+	if (env.verbosity > VERBOSE_NONE)
+		log = env.stdout_saved;
+
+	audit_observer_init(&obs, audit_fd, log, 500);
+
+	skel = test_lsm_audit_kfuncs__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel load"))
+		goto close_prog;
+
+	if (test__start_subtest("net")) {
+		test_audit_log_sockaddr_src(&obs, skel);
+		test_audit_log_sockaddr_dest(&obs, skel);
+		test_audit_log_sockaddr_both_null(&obs, skel);
+		test_audit_log_sockaddr_small_addrlen(&obs, skel);
+		test_audit_log_sock(&obs, skel);
+		test_audit_log_sock_unix(&obs, skel);
+	}
+
+	if (test__start_subtest("file")) {
+		test_audit_log_file(&obs, skel);
+		test_audit_log_path(&obs, skel);
+		test_audit_log_dentry(&obs, skel);
+		test_audit_log_inode(&obs, skel);
+	}
+
+	if (test__start_subtest("task")) {
+		test_audit_log_task(&obs, skel);
+		test_audit_log_cap(&obs, skel);
+	}
+
+	if (test__start_subtest("ioctl"))
+		test_audit_log_ioctl_op(&obs, skel);
+
+	if (test__start_subtest("sleepable"))
+		test_audit_log_sleepable(&obs, skel);
+
+close_prog:
+	test_lsm_audit_kfuncs__destroy(skel);
+	audit_cleanup(audit_fd);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_lsm_audit_kfuncs.c b/tools/testing/selftests/bpf/progs/test_lsm_audit_kfuncs.c
new file mode 100644
index 0000000000000000000000000000000000000000..952ba09fce638f3bd14c18060a5baa3ccaec19ca
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_lsm_audit_kfuncs.c
@@ -0,0 +1,263 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Cloudflare */
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
+#include <errno.h>
+
+#define AF_UNIX 1
+#define AF_INET 2
+#define AF_INET6 10
+
+char _license[] SEC("license") = "GPL";
+
+SEC("lsm/socket_bind")
+int BPF_PROG(test_sockaddr_src, struct socket *sock, struct sockaddr *address,
+	     int addrlen)
+{
+	struct bpf_audit_context *ac;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	switch (address->sa_family) {
+	case AF_INET:
+		bpf_audit_log_cause(ac, "bind4");
+		break;
+	case AF_INET6:
+		bpf_audit_log_cause(ac, "bind6");
+	}
+
+	bpf_audit_log_net_sockaddr(ac, 1, address, NULL, addrlen);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+
+SEC("lsm/socket_connect")
+int BPF_PROG(test_sockaddr_dest, struct socket *sock, struct sockaddr *address,
+	     int addrlen)
+{
+	struct bpf_audit_context *ac;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	switch (address->sa_family) {
+	case AF_INET:
+		bpf_audit_log_cause(ac, "connect4");
+		break;
+	case AF_INET6:
+		bpf_audit_log_cause(ac, "connect6");
+	}
+
+	bpf_audit_log_net_sockaddr(ac, 1, NULL, address, addrlen);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+
+SEC("lsm/socket_bind")
+int BPF_PROG(test_sockaddr_both_null, struct socket *sock,
+	     struct sockaddr *address, int addrlen)
+{
+	struct bpf_audit_context *ac;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	bpf_audit_log_cause(ac, "sockaddr_both_null");
+	bpf_audit_log_net_sockaddr(ac, 1, NULL, NULL, addrlen);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+
+SEC("lsm/socket_bind")
+int BPF_PROG(test_sockaddr_small_addrlen, struct socket *sock,
+	     struct sockaddr *address, int addrlen)
+{
+	struct bpf_audit_context *ac;
+
+	if (address->sa_family != AF_INET)
+		return -EINVAL;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	bpf_audit_log_cause(ac, "sockaddr_small_addrlen");
+	bpf_audit_log_net_sockaddr(ac, 1, address, NULL, 1);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+
+SEC("lsm/socket_getsockopt")
+int BPF_PROG(test_sock, struct socket *sock, int level, int optname)
+{
+	struct bpf_audit_context *ac;
+	struct sock *sk = sock->sk;
+
+	if (!sk)
+		return -EINVAL;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	switch (sk->__sk_common.skc_family) {
+	case AF_INET:
+		bpf_audit_log_cause(ac, "sock4");
+		break;
+	case AF_INET6:
+		bpf_audit_log_cause(ac, "sock6");
+	}
+
+	bpf_audit_log_net_sock(ac, 1, sock);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+
+SEC("lsm/socket_getsockopt")
+int BPF_PROG(test_sock_unix, struct socket *sock, int level, int optname)
+{
+	struct bpf_audit_context *ac;
+	struct sock *sk = sock->sk;
+
+	if (!sk || sk->__sk_common.skc_family != AF_UNIX)
+		return -EINVAL;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	bpf_audit_log_cause(ac, "sock_unix");
+	bpf_audit_log_net_sock(ac, 0, sock);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+
+SEC("lsm/file_open")
+int BPF_PROG(test_file, struct file *file)
+{
+	struct bpf_audit_context *ac;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	bpf_audit_log_cause(ac, "file");
+	bpf_audit_log_file(ac, file);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+
+SEC("lsm/file_open")
+int BPF_PROG(test_file_path, struct file *file)
+{
+	struct bpf_audit_context *ac;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	bpf_audit_log_cause(ac, "path");
+	bpf_audit_log_path(ac, &file->f_path);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+
+SEC("lsm/inode_readlink")
+int BPF_PROG(test_dentry, struct dentry *dentry)
+{
+	struct bpf_audit_context *ac;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	bpf_audit_log_cause(ac, "dentry");
+	bpf_audit_log_dentry(ac, dentry);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+
+SEC("lsm/file_open")
+int BPF_PROG(test_inode, struct file *file)
+{
+	struct bpf_audit_context *ac;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	bpf_audit_log_cause(ac, "inode");
+	bpf_audit_log_inode(ac, file->f_inode);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+
+SEC("lsm/task_getpgid")
+int BPF_PROG(test_task, struct task_struct *task)
+{
+	struct bpf_audit_context *ac;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	bpf_audit_log_cause(ac, "task");
+	bpf_audit_log_task(ac, task);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+
+SEC("lsm/capable")
+int BPF_PROG(test_cap, const struct cred *cred, struct user_namespace *ns,
+	     int cap, unsigned int opts)
+{
+	struct bpf_audit_context *ac;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	bpf_audit_log_cause(ac, "cap");
+	bpf_audit_log_cap(ac, cap);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+
+SEC("lsm/file_ioctl")
+int BPF_PROG(test_ioctl_op, struct file *file, unsigned int cmd,
+	     unsigned long arg)
+{
+	struct bpf_audit_context *ac;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	bpf_audit_log_cause(ac, "ioctl_op");
+	bpf_audit_log_ioctl_op(ac, file, cmd);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+int BPF_PROG(test_sleepable, struct file *file)
+{
+	struct bpf_audit_context *ac;
+
+	ac = bpf_audit_log_start();
+	if (!ac)
+		return -ENOMEM;
+
+	bpf_audit_log_cause(ac, "sleepable");
+	bpf_audit_log_file(ac, file);
+	bpf_audit_log_end(ac);
+	return 0;
+}
+

-- 
2.43.0



* [PATCH RFC bpf-next 3/4] selftests/bpf: Add audit helpers for BPF tests
From: Frederick Lawler @ 2026-03-11 21:31 UTC
  To: Paul Moore, James Morris, Serge E. Hallyn, Eric Paris,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Mickaël Salaün, Günther Noack
  Cc: linux-kernel, linux-security-module, audit, bpf, linux-kselftest,
	kernel-team, Frederick Lawler
In-Reply-To: <20260311-bpf-auditd-send-message-v1-0-10a62db5c92f@cloudflare.com>

Add audit helper utilities for reading and parsing audit messages
in BPF selftests.

Assisted-by: Claude:claude-4.5-opus
Signed-off-by: Frederick Lawler <fred@cloudflare.com>
---
 tools/testing/selftests/bpf/Makefile        |   3 +-
 tools/testing/selftests/bpf/audit_helpers.c | 281 ++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/audit_helpers.h |  55 ++++++
 3 files changed, 338 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 869b582b1d1ff496fb07736597708487be3438ed..76a428539add5e03fe3811b41c55005c22f5cead 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -754,7 +754,8 @@ TRUNNER_EXTRA_SOURCES := test_progs.c		\
 			 flow_dissector_load.h	\
 			 ip_check_defrag_frags.h	\
 			 bpftool_helpers.c	\
-			 usdt_1.c usdt_2.c
+			 usdt_1.c usdt_2.c	\
+			 audit_helpers.c
 TRUNNER_LIB_SOURCES := find_bit.c
 TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read				\
 		       $(OUTPUT)/liburandom_read.so			\
diff --git a/tools/testing/selftests/bpf/audit_helpers.c b/tools/testing/selftests/bpf/audit_helpers.c
new file mode 100644
index 0000000000000000000000000000000000000000..a105136a581f92a1af73b9456b1e85dc88176678
--- /dev/null
+++ b/tools/testing/selftests/bpf/audit_helpers.c
@@ -0,0 +1,281 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * BPF audit helpers
+ *
+ * Borrowed code from tools/testing/selftests/landlock/audit.h
+ *
+ * Copyright (C) 2024-2025 Microsoft Corporation
+ * Copyright (c) 2026 Cloudflare
+ */
+#define _GNU_SOURCE
+
+#include <errno.h>
+#include <fcntl.h>
+#include <poll.h>
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+#include <linux/audit.h>
+#include <linux/netlink.h>
+#include <netinet/in.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#include <sys/stat.h>
+#include <sys/un.h>
+
+#include "audit_helpers.h"
+
+static __u32 seq;
+
+int audit_init(void)
+{
+	int bufsize = 1024 * 1024; /* 1MB receive buffer */
+	struct audit_message msg;
+	int fd, err;
+
+	fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_AUDIT);
+	if (fd < 0)
+		return -errno;
+
+	/*
+	 * Increase receive buffer to reduce kernel-side queueing.
+	 * When the socket buffer fills up, audit records get queued in
+	 * the kernel's hold/retry queues and delivered on subsequent runs.
+	 */
+	setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize));
+
+	seq = 0;
+	err = audit_send(fd, AUDIT_SET, AUDIT_STATUS_ENABLED, 1);
+	if (err)
+		goto out_close;
+
+	do {
+		err = audit_recv(fd, &msg, 0);
+		if (err < 0)
+			goto out_close;
+	} while (msg.nlh.nlmsg_type != NLMSG_ERROR);
+
+	if (msg.err.error)
+		goto out_close;
+
+	err = audit_send(fd, AUDIT_SET, AUDIT_STATUS_PID, getpid());
+	if (err)
+		goto out_close;
+
+	do {
+		err = audit_recv(fd, &msg, 0);
+		if (err < 0)
+			goto out_close;
+	} while (msg.nlh.nlmsg_type != NLMSG_ERROR);
+
+	if (msg.err.error)
+		goto out_close;
+
+	return fd;
+
+out_close:
+	close(fd);
+	return err;
+}
+
+void audit_cleanup(int fd)
+{
+	if (fd >= 0)
+		close(fd);
+}
+
+int audit_send(int fd, __u16 type, __u32 key, __u32 val)
+{
+	struct audit_message msg = {
+		.nlh = {
+			.nlmsg_len = NLMSG_SPACE(sizeof(msg.status)),
+			.nlmsg_type = type,
+			.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK,
+			.nlmsg_seq = ++seq,
+		},
+		.status = {
+			.mask = key,
+			.enabled = key == AUDIT_STATUS_ENABLED ? val : 0,
+			.pid = key == AUDIT_STATUS_PID ? val : 0,
+		},
+	};
+	struct sockaddr_nl addr = { .nl_family = AF_NETLINK };
+	int ret;
+
+	do {
+		ret = sendto(fd, &msg, msg.nlh.nlmsg_len, 0,
+			     (struct sockaddr *)&addr, sizeof(addr));
+	} while (ret < 0 && errno == EINTR);
+
+	return ret == msg.nlh.nlmsg_len ? 0 : -errno;
+}
+
+/*
+ * Receive an audit message from the netlink socket.
+ * Returns:
+ *   > 0: message type on success
+ *   0: ACK received (NLMSG_ERROR with error=0)
+ *   < 0: negative errno on error
+ */
+int audit_recv(int fd, struct audit_message *msg, int flags)
+{
+	struct sockaddr_nl addr;
+	socklen_t addrlen = sizeof(addr);
+	int ret;
+
+	do {
+		ret = recvfrom(fd, msg, sizeof(*msg), flags,
+			       (struct sockaddr *)&addr, &addrlen);
+	} while (ret < 0 && errno == EINTR);
+
+	if (ret < 0)
+		return -errno;
+
+	/* Must be from kernel (pid 0) */
+	if (addrlen != sizeof(addr) || addr.nl_pid != 0)
+		return -EINVAL;
+
+	/*
+	 * NLMSG_ERROR with error=0 is an ACK. The kernel sends this in
+	 * response to messages with NLM_F_ACK flag set.
+	 */
+	if (msg->nlh.nlmsg_type == NLMSG_ERROR) {
+		if (msg->err.error == 0)
+			return 0; /* ACK */
+		return msg->err.error;
+	}
+
+	return msg->nlh.nlmsg_type;
+}
+
+__printf(2, 3) static inline void
+debug(struct audit_observer *obs, const char *fmt, ...)
+{
+	va_list args;
+
+	if (!obs || !obs->log)
+		return;
+
+	va_start(args, fmt);
+	vfprintf(obs->log, fmt, args);
+	va_end(args);
+}
+
+void audit_observer_init(struct audit_observer *obs, int audit_fd, FILE *log,
+			 int wait_timeout_ms)
+{
+	obs->audit_fd = audit_fd;
+	obs->wait_timeout = wait_timeout_ms;
+
+	/* Assign unconditionally: a NULL log must clear any stale pointer */
+	obs->log = log;
+
+	audit_observer_reset(obs);
+}
+
+void audit_observer_reset(struct audit_observer *obs)
+{
+	memset(obs->expects, 0, sizeof(obs->expects));
+	obs->num_expects = 0;
+}
+
+int audit_observer_expect(struct audit_observer *obs, int audit_type,
+			  const char *pattern, int count)
+{
+	struct audit_expectation *exp;
+
+	if (obs->num_expects >= AUDIT_EXPECT_MAX)
+		return -EINVAL;
+
+	exp = &obs->expects[obs->num_expects++];
+	exp->type = audit_type;
+	exp->pattern = pattern;
+	exp->expected_count = count;
+	exp->matched_count = 0;
+	return 0;
+}
+
+/*
+ * Check if a message matches any pending expectation.
+ * Returns 1 if all expectations are satisfied, 0 otherwise.
+ */
+static int audit_observer_match(struct audit_observer *obs,
+				struct audit_message *msg)
+{
+	int all_satisfied = 1;
+
+	for (int i = 0; i < obs->num_expects; i++) {
+		struct audit_expectation *exp = &obs->expects[i];
+
+		if (exp->matched_count >= exp->expected_count)
+			continue;
+
+		/* Check if this message matches */
+		if (exp->type && msg->nlh.nlmsg_type != exp->type)
+			goto check_satisfied;
+
+		if (strstr(msg->data, exp->pattern)) {
+			exp->matched_count++;
+			debug(obs, "%s: matched [%d/%d] %s\n", __func__,
+			      exp->matched_count, exp->expected_count,
+			      exp->pattern);
+		}
+
+check_satisfied:
+		if (exp->matched_count < exp->expected_count)
+			all_satisfied = 0;
+	}
+
+	return all_satisfied;
+}
+
+/*
+ * Wait for all expected audit messages to arrive.
+ * Returns 0 on success (all expectations met), -ETIMEDOUT on timeout.
+ */
+int audit_observer_wait(struct audit_observer *obs)
+{
+	struct pollfd pfd = { .fd = obs->audit_fd, .events = POLLIN };
+	struct audit_message msg;
+	int ret;
+
+	while (1) {
+		ret = poll(&pfd, 1, obs->wait_timeout);
+		if (ret < 0)
+			return -errno;
+		if (ret == 0)
+			return -ETIMEDOUT;
+
+		memset(&msg, 0, sizeof(msg));
+		ret = audit_recv(obs->audit_fd, &msg, MSG_DONTWAIT);
+
+		if (ret == -EAGAIN || ret == -EWOULDBLOCK)
+			continue;
+
+		if (ret <= 0)
+			continue;
+
+		debug(obs, "%s: recv type=%d %s\n", __func__,
+		      msg.nlh.nlmsg_type, msg.data);
+
+		if (audit_observer_match(obs, &msg))
+			return 0;
+	}
+}
+
+int audit_observer_check_satisfied(struct audit_observer *obs)
+{
+	for (int i = 0; i < obs->num_expects; i++) {
+		struct audit_expectation *exp = &obs->expects[i];
+
+		if (exp->matched_count < exp->expected_count) {
+			debug(obs, "%s: FAILED pattern '%s' got %d/%d\n",
+			      __func__, exp->pattern, exp->matched_count,
+			      exp->expected_count);
+			return 0;
+		}
+	}
+
+	return 1;
+}
diff --git a/tools/testing/selftests/bpf/audit_helpers.h b/tools/testing/selftests/bpf/audit_helpers.h
new file mode 100644
index 0000000000000000000000000000000000000000..40f3d20635bb25c305067756897593f34d54531e
--- /dev/null
+++ b/tools/testing/selftests/bpf/audit_helpers.h
@@ -0,0 +1,55 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2026 Cloudflare */
+#pragma once
+
+#include <linux/audit.h>
+#include <linux/netlink.h>
+#include <stdio.h>
+
+#define MAX_AUDIT_MESSAGE_LENGTH 8970
+
+struct audit_message {
+	struct nlmsghdr nlh;
+	union {
+		struct audit_status status;
+		struct nlmsgerr err;
+		char data[MAX_AUDIT_MESSAGE_LENGTH];
+	};
+};
+
+/*
+ * Observer-based audit message matching.
+ * Tests register expected patterns before triggering events, then
+ * wait for matches. Messages that don't match any pattern are skipped.
+ */
+#define AUDIT_EXPECT_MAX 32
+
+struct audit_expectation {
+	__u16 type;
+	const char *pattern;
+	int expected_count;
+	int matched_count;
+};
+
+struct audit_observer {
+	struct audit_expectation expects[AUDIT_EXPECT_MAX];
+	int num_expects;
+	FILE *log;
+	int wait_timeout;
+	int audit_fd;
+};
+
+int audit_init(void);
+void audit_cleanup(int fd);
+int audit_wait_ack(int fd);
+int audit_send(int fd, __u16 type, __u32 key, __u32 val);
+int audit_recv(int fd, struct audit_message *msg, int flags);
+int audit_wait_ack(int fd);
+
+void audit_observer_init(struct audit_observer *obs, int audit_fd, FILE *log,
+			 int wait_timeout);
+void audit_observer_reset(struct audit_observer *obs);
+int audit_observer_expect(struct audit_observer *obs, int audit_type,
+			  const char *pattern, int count);
+int audit_observer_wait(struct audit_observer *obs);
+int audit_observer_check_satisfied(struct audit_observer *obs);

-- 
2.43.0



* [PATCH RFC bpf-next 2/4] audit/security: Enable audit BPF kfuncs
From: Frederick Lawler @ 2026-03-11 21:31 UTC (permalink / raw)
  To: Paul Moore, James Morris, Serge E. Hallyn, Eric Paris,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Mickaël Salaün, Günther Noack
  Cc: linux-kernel, linux-security-module, audit, bpf, linux-kselftest,
	kernel-team, Frederick Lawler
In-Reply-To: <20260311-bpf-auditd-send-message-v1-0-10a62db5c92f@cloudflare.com>

Enable audit BPF kfuncs.

Signed-off-by: Frederick Lawler <fred@cloudflare.com>
---
 security/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/security/Makefile b/security/Makefile
index 4601230ba442a1bcedc3f999b74a7796ac72894d..de980b2797c1f8f8d0eaeb1be949c41e6ecb8fc1 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -16,6 +16,8 @@ obj-$(CONFIG_SECURITYFS)		+= inode.o
 obj-$(CONFIG_SECURITY_SELINUX)		+= selinux/
 obj-$(CONFIG_SECURITY_SMACK)		+= smack/
 obj-$(CONFIG_HAS_SECURITY_AUDIT)	+= lsm_audit.o
+lsm_audit-y += lsm_audit.o
+lsm_audit-$(CONFIG_BPF_LSM)	+= lsm_audit_kfuncs.o
 obj-$(CONFIG_SECURITY_TOMOYO)		+= tomoyo/
 obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
 obj-$(CONFIG_SECURITY_YAMA)		+= yama/

-- 
2.43.0



* [PATCH RFC bpf-next 1/4] audit: Implement bpf_audit_log_*() wrappers
From: Frederick Lawler @ 2026-03-11 21:31 UTC (permalink / raw)
  To: Paul Moore, James Morris, Serge E. Hallyn, Eric Paris,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Mickaël Salaün, Günther Noack
  Cc: linux-kernel, linux-security-module, audit, bpf, linux-kselftest,
	kernel-team, Frederick Lawler
In-Reply-To: <20260311-bpf-auditd-send-message-v1-0-10a62db5c92f@cloudflare.com>

The primary use case is to give LSM designers a direct API for reporting
access allows and denials through the audit subsystem, similar to how LSMs
traditionally log their accesses.

Left out of this API are functions that are potentially abusable, such as
audit_log_format(), where users may fill in any field=value pair. Instead,
the API mostly follows what is exposed through security/lsm_audit.c for
consistency with user space audit expectations. Furthermore, each function
reports at most once per audit context to avoid repeated-call abuse.

Lastly, each audit record includes the ID of the loaded BPF program to
track which program reported the log entry. This helps remove
ambiguity in the event multiple programs are registered to the same
security hook.

Exposed functions:

	bpf_audit_log_start()
	bpf_audit_log_end()
	bpf_audit_log_cause()
	bpf_audit_log_cap()
	bpf_audit_log_path()
	bpf_audit_log_file()
	bpf_audit_log_ioctl_op()
	bpf_audit_log_dentry()
	bpf_audit_log_inode()
	bpf_audit_log_task()
	bpf_audit_log_net_sock()
	bpf_audit_log_net_sockaddr()

Signed-off-by: Frederick Lawler <fred@cloudflare.com>
---
 include/linux/lsm_audit.h   |   1 +
 include/uapi/linux/audit.h  |   1 +
 security/lsm_audit_kfuncs.c | 306 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 308 insertions(+)

diff --git a/include/linux/lsm_audit.h b/include/linux/lsm_audit.h
index 382c56a97bba1d0e5efe082553338229d541e267..859f51590de417ac246309eb75a760b8632224be 100644
--- a/include/linux/lsm_audit.h
+++ b/include/linux/lsm_audit.h
@@ -78,6 +78,7 @@ struct common_audit_data {
 #define LSM_AUDIT_DATA_NOTIFICATION 16
 #define LSM_AUDIT_DATA_ANONINODE	17
 #define LSM_AUDIT_DATA_NLMSGTYPE	18
+#define LSM_AUDIT_DATA_CAUSE 19 /* unused */
 	union 	{
 		struct path path;
 		struct dentry *dentry;
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index 14a1c1fe013acecb12ea6bf81690965421baa7ff..7a22e214fe3e421decfc4109d2e6a3cee996fe51 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -150,6 +150,7 @@
 #define AUDIT_LANDLOCK_DOMAIN	1424	/* Landlock domain status */
 #define AUDIT_MAC_TASK_CONTEXTS	1425	/* Multiple LSM task contexts */
 #define AUDIT_MAC_OBJ_CONTEXTS	1426	/* Multiple LSM objext contexts */
+#define AUDIT_BPF_LSM_ACCESS		1427	/* LSM BPF MAC events */
 
 #define AUDIT_FIRST_KERN_ANOM_MSG   1700
 #define AUDIT_LAST_KERN_ANOM_MSG    1799
diff --git a/security/lsm_audit_kfuncs.c b/security/lsm_audit_kfuncs.c
new file mode 100644
index 0000000000000000000000000000000000000000..0d4fb20be34a61db29aa2c48d2aefc39131e73bf
--- /dev/null
+++ b/security/lsm_audit_kfuncs.c
@@ -0,0 +1,306 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2026 Cloudflare */
+
+#include <linux/audit.h>
+#include <linux/bpf_mem_alloc.h>
+#include <linux/gfp_types.h>
+#include <linux/in6.h>
+#include <linux/lsm_audit.h>
+#include <linux/socket.h>
+#include <linux/types.h>
+
+struct bpf_audit_context {
+	struct audit_buffer *ab;
+	u64 log_once_mask;
+};
+
+static struct bpf_mem_alloc bpf_audit_context_ma;
+
+static inline u64 log_once(struct bpf_audit_context *ac, u64 mask)
+{
+	u64 set = (ac->log_once_mask & mask);
+
+	ac->log_once_mask |= mask;
+	return set;
+}
+
+static inline int __audit_log_lsm_data(struct bpf_audit_context *ac,
+				       struct common_audit_data *ad)
+{
+	if (log_once(ac, BIT_ULL(ad->type)))
+		return -EINVAL;
+
+	audit_log_lsm_data(ac->ab, ad);
+	return 0;
+}
+
+__bpf_kfunc_start_defs();
+
+__bpf_kfunc
+struct bpf_audit_context *bpf_audit_log_start(struct bpf_prog_aux *aux)
+{
+	char comm[sizeof(current->comm)];
+	struct bpf_audit_context *ac;
+
+	ac = bpf_mem_cache_alloc(&bpf_audit_context_ma);
+	if (!ac)
+		return NULL;
+
+	memset(ac, 0, sizeof(*ac));
+	ac->ab = audit_log_start(audit_context(),
+				 (aux->might_sleep) ? GFP_KERNEL : GFP_ATOMIC,
+				 AUDIT_BPF_LSM_ACCESS);
+	if (!ac->ab) {
+		bpf_mem_cache_free(&bpf_audit_context_ma, ac);
+		return NULL;
+	}
+
+	audit_log_format(ac->ab, "prog-id=%u", aux->id);
+
+	/* Audit may not have a filter configured for syscalls. Include
+	 * potentially redundant pid & comm information
+	 */
+	audit_log_format(ac->ab, " pid=%d comm=", task_tgid_nr(current));
+	audit_log_untrustedstring(ac->ab, get_task_comm(comm, current));
+
+	return ac;
+}
+
+__bpf_kfunc void bpf_audit_log_end(struct bpf_audit_context *ac)
+{
+	audit_log_end(ac->ab);
+	bpf_mem_cache_free(&bpf_audit_context_ma, ac);
+}
+
+__bpf_kfunc int bpf_audit_log_cause(struct bpf_audit_context *ac,
+				    const char *cause__str)
+{
+	if (log_once(ac, BIT_ULL(LSM_AUDIT_DATA_CAUSE)))
+		return -EINVAL;
+
+	audit_log_format(ac->ab, " cause=");
+	audit_log_untrustedstring(ac->ab, cause__str);
+	return 0;
+}
+
+__bpf_kfunc int bpf_audit_log_cap(struct bpf_audit_context *ac, int cap)
+{
+	struct common_audit_data ad;
+
+	ad.type = LSM_AUDIT_DATA_CAP;
+	ad.u.cap = cap;
+	return __audit_log_lsm_data(ac, &ad);
+}
+
+__bpf_kfunc int bpf_audit_log_path(struct bpf_audit_context *ac,
+				   const struct path *path)
+{
+	struct common_audit_data ad;
+
+	/* DATA_PATH prints similar to DATA_FILE */
+	if (log_once(ac, BIT_ULL(LSM_AUDIT_DATA_FILE)))
+		return -EINVAL;
+
+	ad.type = LSM_AUDIT_DATA_PATH;
+	ad.u.path = *path;
+	return __audit_log_lsm_data(ac, &ad);
+}
+
+__bpf_kfunc int bpf_audit_log_file(struct bpf_audit_context *ac,
+				   struct file *file)
+{
+	struct common_audit_data ad;
+
+	/* DATA_PATH prints similar to DATA_FILE */
+	if (log_once(ac, BIT_ULL(LSM_AUDIT_DATA_PATH)))
+		return -EINVAL;
+
+	ad.type = LSM_AUDIT_DATA_FILE;
+	ad.u.file = file;
+	return __audit_log_lsm_data(ac, &ad);
+}
+
+__bpf_kfunc int bpf_audit_log_ioctl_op(struct bpf_audit_context *ac,
+				       struct file *file, u16 cmd)
+{
+	struct lsm_ioctlop_audit op = { .path = file->f_path, .cmd = cmd };
+	struct common_audit_data ad;
+
+	ad.type = LSM_AUDIT_DATA_IOCTL_OP;
+	ad.u.op = &op;
+	return __audit_log_lsm_data(ac, &ad);
+}
+
+__bpf_kfunc int bpf_audit_log_dentry(struct bpf_audit_context *ac,
+				     struct dentry *dentry)
+{
+	struct common_audit_data ad;
+
+	/* DATA_DENTRY prints similar to DATA_INODE */
+	if (log_once(ac, BIT_ULL(LSM_AUDIT_DATA_INODE)))
+		return -EINVAL;
+
+	ad.type = LSM_AUDIT_DATA_DENTRY;
+	ad.u.dentry = dentry;
+	return __audit_log_lsm_data(ac, &ad);
+}
+
+__bpf_kfunc int bpf_audit_log_inode(struct bpf_audit_context *ac,
+				    struct inode *inode)
+{
+	struct common_audit_data ad;
+
+	/* DATA_DENTRY prints similar to DATA_INODE */
+	if (log_once(ac, BIT_ULL(LSM_AUDIT_DATA_DENTRY)))
+		return -EINVAL;
+
+	ad.type = LSM_AUDIT_DATA_INODE;
+	ad.u.inode = inode;
+	return __audit_log_lsm_data(ac, &ad);
+}
+
+__bpf_kfunc int bpf_audit_log_task(struct bpf_audit_context *ac,
+				   struct task_struct *tsk)
+{
+	struct common_audit_data ad;
+
+	ad.type = LSM_AUDIT_DATA_TASK;
+	ad.u.tsk = tsk;
+	return __audit_log_lsm_data(ac, &ad);
+}
+
+__bpf_kfunc int bpf_audit_log_net_sock(struct bpf_audit_context *ac, int netif,
+				       const struct socket *sock)
+{
+	struct lsm_network_audit net = { .sk = sock->sk, .netif = netif };
+	struct common_audit_data ad;
+
+	ad.type = LSM_AUDIT_DATA_NET;
+	ad.u.net = &net;
+	return __audit_log_lsm_data(ac, &ad);
+}
+
+__bpf_kfunc int
+bpf_audit_log_net_sockaddr(struct bpf_audit_context *ac, int netif,
+			   const struct sockaddr *saddr__nullable,
+			   const struct sockaddr *daddr__nullable, int addrlen)
+{
+	struct lsm_network_audit net = {};
+	struct common_audit_data ad;
+
+	net.netif = netif;
+
+	if (!saddr__nullable && !daddr__nullable)
+		return -EINVAL;
+
+	if (saddr__nullable && daddr__nullable &&
+	    saddr__nullable->sa_family != daddr__nullable->sa_family)
+		return -EINVAL;
+
+	if (saddr__nullable)
+		net.family = saddr__nullable->sa_family;
+	else
+		net.family = daddr__nullable->sa_family;
+
+	switch (net.family) {
+#if IS_ENABLED(CONFIG_IPV6)
+	case AF_INET6:
+		if (addrlen < SIN6_LEN_RFC2133)
+			return -EINVAL;
+
+		if (saddr__nullable) {
+			struct sockaddr_in6 *saddr =
+				(struct sockaddr_in6 *)saddr__nullable;
+			net.fam.v6.saddr = saddr->sin6_addr;
+			net.sport = saddr->sin6_port;
+		}
+
+		if (daddr__nullable) {
+			struct sockaddr_in6 *daddr =
+				(struct sockaddr_in6 *)daddr__nullable;
+			net.fam.v6.daddr = daddr->sin6_addr;
+			net.dport = daddr->sin6_port;
+		}
+		break;
+#endif
+	case AF_INET:
+		if (addrlen < sizeof(struct sockaddr_in))
+			return -EINVAL;
+
+		if (saddr__nullable) {
+			struct sockaddr_in *saddr =
+				(struct sockaddr_in *)saddr__nullable;
+			net.fam.v4.saddr = saddr->sin_addr.s_addr;
+			net.sport = saddr->sin_port;
+		}
+
+		if (daddr__nullable) {
+			struct sockaddr_in *daddr =
+				(struct sockaddr_in *)daddr__nullable;
+			net.fam.v4.daddr = daddr->sin_addr.s_addr;
+			net.dport = daddr->sin_port;
+		}
+		break;
+	default:
+		return -EAFNOSUPPORT;
+	}
+
+	ad.type = LSM_AUDIT_DATA_NET;
+	ad.u.net = &net;
+	return __audit_log_lsm_data(ac, &ad);
+}
+
+__bpf_kfunc_end_defs();
+
+BTF_KFUNCS_START(lsm_audit_set_ids)
+
+BTF_ID_FLAGS(func, bpf_audit_log_start,
+	     KF_ACQUIRE | KF_DESTRUCTIVE | KF_IMPLICIT_ARGS | KF_RET_NULL);
+
+BTF_ID_FLAGS(func, bpf_audit_log_end, KF_DESTRUCTIVE | KF_RELEASE);
+
+/* The kfuncs below can recurse if a BPF LSM program that calls a
+ * bpf_audit_log_*() kfunc is attached to any of these hooks:
+ *  security_current_getlsmprop_subj,
+ *  security_lsmprop_to_secctx, or
+ *  security_release_secctx
+ */
+BTF_ID_FLAGS(func, bpf_audit_log_cause, KF_DESTRUCTIVE);
+BTF_ID_FLAGS(func, bpf_audit_log_cap, KF_DESTRUCTIVE);
+BTF_ID_FLAGS(func, bpf_audit_log_path, KF_DESTRUCTIVE);
+BTF_ID_FLAGS(func, bpf_audit_log_file, KF_DESTRUCTIVE);
+BTF_ID_FLAGS(func, bpf_audit_log_ioctl_op, KF_DESTRUCTIVE);
+BTF_ID_FLAGS(func, bpf_audit_log_dentry, KF_DESTRUCTIVE);
+BTF_ID_FLAGS(func, bpf_audit_log_inode, KF_DESTRUCTIVE);
+BTF_ID_FLAGS(func, bpf_audit_log_task, KF_DESTRUCTIVE);
+BTF_ID_FLAGS(func, bpf_audit_log_net_sock, KF_DESTRUCTIVE);
+BTF_ID_FLAGS(func, bpf_audit_log_net_sockaddr, KF_DESTRUCTIVE);
+
+BTF_KFUNCS_END(lsm_audit_set_ids)
+
+static int bpf_lsm_audit_kfuncs_filter(const struct bpf_prog *prog,
+				       u32 kfunc_id)
+{
+	if (!btf_id_set8_contains(&lsm_audit_set_ids, kfunc_id))
+		return 0;
+
+	return prog->type != BPF_PROG_TYPE_LSM ? -EACCES : 0;
+}
+
+static const struct btf_kfunc_id_set bpf_lsm_audit_set = {
+	.owner = THIS_MODULE,
+	.set = &lsm_audit_set_ids,
+	.filter = bpf_lsm_audit_kfuncs_filter,
+};
+
+static int lsm_audit_init_bpf(void)
+{
+	int ret;
+
+	ret = bpf_mem_alloc_init(&bpf_audit_context_ma,
+				 sizeof(struct bpf_audit_context), false);
+	return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM,
+						 &bpf_lsm_audit_set);
+}
+
+late_initcall(lsm_audit_init_bpf);

-- 
2.43.0



* [PATCH RFC bpf-next 0/4] audit: Expose audit subsystem to BPF LSM programs via BPF kfuncs
From: Frederick Lawler @ 2026-03-11 21:31 UTC (permalink / raw)
  To: Paul Moore, James Morris, Serge E. Hallyn, Eric Paris,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Mickaël Salaün, Günther Noack
  Cc: linux-kernel, linux-security-module, audit, bpf, linux-kselftest,
	kernel-team, Frederick Lawler

The motivation behind the change is to give BPF LSM developers the
ability to report accesses via the audit subsystem much like how LSMs
operate today.

Series:

Patch 1: Introduces bpf_audit_log_*() kfuncs
Patch 2: Enables bpf_audit_log_*() kfuncs
Patch 3: Prepares audit helpers used for testing
Patch 4: Adds self tests

Documentation will be added when this becomes a versioned series.

Key features:

1. Audit logs include type=AUDIT_BPF_LSM_ACCESS, the BPF program ID, and
the pid and comm of the task that triggered the hook by default

We wanted audit log consumers to be able to track who and what created
the entry. prog-id=%d is already used for BPF LOAD/UNLOAD logs, and is
therefore reused here for this distinction. However, it may be better to
use the tag instead to capture which _specific_ version of the program
made the log, since prog-id can be reused.
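
For log consumers, a minimal user space sketch of pulling these default
fields out of a record body. It assumes the layout emitted by
bpf_audit_log_start() in patch 1 (prog-id, then pid, then a double-quoted
comm). The struct and function names are hypothetical, and a hex-encoded
comm, which audit_log_untrustedstring() emits for names containing special
characters, is deliberately rejected rather than decoded:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the default fields of a BPF LSM audit record. */
struct bpf_lsm_record_prefix {
	unsigned int prog_id;
	int pid;
	char comm[16];	/* comm is at most 15 characters plus the NUL */
};

/*
 * Parse 'prog-id=<id> pid=<pid> comm="<name>"' from the start of a record
 * body. Returns 0 on success, -1 if the text does not follow this layout.
 */
int parse_record_prefix(const char *body, struct bpf_lsm_record_prefix *out)
{
	memset(out, 0, sizeof(*out));

	/* %15[^"] reads up to 15 characters that are not a double quote. */
	if (sscanf(body, "prog-id=%u pid=%d comm=\"%15[^\"]\"",
		   &out->prog_id, &out->pid, out->comm) != 3)
		return -1;

	return 0;
}
```

Anything after the closing quote of comm is ignored, so the remaining
fields can be handled separately in whatever order they appear.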

2. Leverages BPF KF_ACQUIRE/KF_RELEASE semantics to force use of
  bpf_audit_log_end().

One side effect of this decision is that, according to the BPF
documentation, these flags allow the pointer to struct bpf_audit_context
to be stored in a map and then exchanged through bpf_kptr_xchg(). However,
following the precedent in net/netfilter/nf_conntrack_bpf.c, the struct is
neither exposed as a kptr nor given a dtor function, so the verifier will
not allow this use case. Ideally, we don't want the pointer to be
exchanged anyway, because the reporting program could become ambiguous.
I am sure there are other edge cases with respect to leaving the audit
buffer in a strange state that I cannot think of at the moment.

3. All bpf_audit_log_*() functions are destructive

The audit subsystem allows AUDIT_FAIL_PANIC to be set, in which case the
kernel panics when the subsystem detects missing events. Further, some
call paths may invoke a BUG_ON(). Therefore all the functions are marked
destructive.

4. Functions are callable once per bpf_audit_context

The rationale for this is to prevent abuse. Logs with repeated fields
are not helpful, and may not be handled coherently by user space audit.

This is in the same vein as not providing an audit_log_format() wrapper.

Similarly, some functions, such as bpf_audit_log_path() and
bpf_audit_log_file(), report the same information and are therefore
interchangeable: calling one uses up the slot for the other.
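
As a rough illustration, the once-per-context rule can be modelled in user
space as below. This is only a sketch of the bitmask idea from patch 1, not
the kernel code: the type numbers, the struct name, and the model_*()
function names are hypothetical, and no audit record is actually written.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative type numbers; the kernel uses the LSM_AUDIT_DATA_* macros. */
enum { MODEL_DATA_PATH = 1, MODEL_DATA_CAP = 3, MODEL_DATA_FILE = 12 };

struct audit_ctx_model {
	uint64_t log_once_mask;	/* one bit per field type already logged */
};

/* Return the bits of mask that were already set, then set all of them. */
uint64_t model_log_once(struct audit_ctx_model *ac, uint64_t mask)
{
	uint64_t set = ac->log_once_mask & mask;

	ac->log_once_mask |= mask;
	return set;
}

/* Log one field type; a second call with the same type is refused. */
int model_log_data(struct audit_ctx_model *ac, int type)
{
	if (model_log_once(ac, UINT64_C(1) << type))
		return -EINVAL;

	/* The kernel version would call audit_log_lsm_data() here. */
	return 0;
}

/* PATH and FILE print similar fields, so each claims the other's bit. */
int model_log_path(struct audit_ctx_model *ac)
{
	if (model_log_once(ac, UINT64_C(1) << MODEL_DATA_FILE))
		return -EINVAL;

	return model_log_data(ac, MODEL_DATA_PATH);
}

int model_log_file(struct audit_ctx_model *ac)
{
	if (model_log_once(ac, UINT64_C(1) << MODEL_DATA_PATH))
		return -EINVAL;

	return model_log_data(ac, MODEL_DATA_FILE);
}
```

With a zeroed context, the first model_log_path() call succeeds, and any
later model_log_path() or model_log_file() call on that context returns
-EINVAL, while unrelated types such as the capability field stay available.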

5. API wraps security/lsm_audit.c

lsm_audit.c functions are multiplexed and not handled very well by the BPF
verifier, thus each wrapped function is isolated to a single purpose for
use within hooks.

Key considerations:

1. Audit field ordering

AFAIK, user space audit is particular about what fields are
present and their order. This patch series does not address ordering.

My assumption is that the first four fields (type, prog-id, pid, comm)
are well known, and user space can assume that the fields after those
may appear in any order.

If that is not acceptable, I would propose that we leverage the struct
common_audit_data type order to be the order--much like how the type is
used for log_once() functionality.
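
One possible reading of that proposal, sketched in user space: fields are
stashed under their type number as they are reported and only emitted, in
ascending type order, when the record is closed. Everything here is
hypothetical (names, sizes, and type numbers) and is meant to show the
ordering idea only, not how the kernel would buffer the data.

```c
#include <stdio.h>
#include <string.h>

#define MODEL_FIELD_TYPES 8	/* hypothetical number of field types */
#define MODEL_FIELD_LEN 64	/* hypothetical per-field text limit */

struct ordered_record {
	char fields[MODEL_FIELD_TYPES][MODEL_FIELD_LEN];
};

/* Stash a field under its type; refuse bad types, empty text, and repeats. */
int record_set(struct ordered_record *rec, unsigned int type, const char *text)
{
	if (type >= MODEL_FIELD_TYPES || text[0] == '\0' ||
	    rec->fields[type][0] != '\0')
		return -1;

	snprintf(rec->fields[type], MODEL_FIELD_LEN, "%s", text);
	return 0;
}

/* Write the stashed fields to out in ascending type order, space separated. */
void record_emit(const struct ordered_record *rec, char *out, size_t len)
{
	size_t used = 0;

	if (len == 0)
		return;
	out[0] = '\0';

	for (unsigned int t = 0; t < MODEL_FIELD_TYPES; t++) {
		int n;

		if (rec->fields[t][0] == '\0')
			continue;

		n = snprintf(out + used, len - used, "%s%s",
			     used ? " " : "", rec->fields[t]);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output buffer is full, stop here */
		used += (size_t)n;
	}
}
```

Reporting a capability field (say type 3) before a path field (say type 1)
would still emit the path first, so consumers see a stable order regardless
of the order of calls in the BPF program.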

I am open to other ideas.

Signed-off-by: Frederick Lawler <fred@cloudflare.com>
---
Frederick Lawler (4):
      audit: Implement bpf_audit_log_*() wrappers
      audit/security: Enable audit BPF kfuncs
      selftests/bpf: Add audit helpers for BPF tests
      selftests/bpf: Add lsm_audit_kfuncs tests

 include/linux/lsm_audit.h                          |   1 +
 include/uapi/linux/audit.h                         |   1 +
 security/Makefile                                  |   2 +
 security/lsm_audit_kfuncs.c                        | 306 +++++++++++
 tools/testing/selftests/bpf/Makefile               |   3 +-
 tools/testing/selftests/bpf/audit_helpers.c        | 281 ++++++++++
 tools/testing/selftests/bpf/audit_helpers.h        |  55 ++
 .../selftests/bpf/prog_tests/lsm_audit_kfuncs.c    | 598 +++++++++++++++++++++
 .../selftests/bpf/progs/test_lsm_audit_kfuncs.c    | 263 +++++++++
 9 files changed, 1509 insertions(+), 1 deletion(-)
---
base-commit: ca0f39a369c5f927c3d004e63a5a778b08a9df94
change-id: 20260105-bpf-auditd-send-message-4a883067aab8

Best regards,
-- 
Frederick Lawler <fred@cloudflare.com>


* Re: [PATCH v2 0/5] rust: lsm: introduce safe Rust abstractions for the LSM framework
From: Paul Moore @ 2026-03-11 21:16 UTC (permalink / raw)
  To: Jamie Lindsey
  Cc: rust-for-linux, Alice Ryhl, linux-security-module, ojeda, jmorris,
	serge
In-Reply-To: <CAH5fLgiQm=2YYvmG54o-MEt2m8x5V5xZrtmsqEUtuB9OZ=FPOw@mail.gmail.com>

On Wed, Mar 11, 2026 at 2:49 AM Alice Ryhl <aliceryhl@google.com> wrote:
> On Wed, Mar 11, 2026 at 6:09 AM Jamie Lindsey <jamie@matrixforgelabs.com> wrote:
> >
> > v2: add missing Signed-off-by tags, fix short commit hash in patch 4.
> > No code changes from v1.
> >
> > This series introduces the first safe Rust abstractions for the Linux
> > Security Module (LSM) framework.  It allows a complete, policy-enforcing
> > LSM to be written entirely in Rust with no C boilerplate required from
> > the LSM author.
> >
> > --- Motivation ---
> >
> > The LSM framework is a natural target for Rust: hook registration is
> > unsafe by nature (raw function pointers, C ABI, __randomize_layout on
> > the hook list struct), and the trait system can enforce correct
> > implementation at compile time.
>
> Hi Jamie,
>
> What is the intended end-user of these abstractions?

Building on Alice's question, I wanted to mention that we don't
accept/merge example LSMs into the upstream Linux kernel.  I'm
supportive of using Rust to develop new LSMs, and I recognize that
developing a meaningful LSM in Rust will require significant
shim/plumbing work, but that shim work needs to be done in conjunction
with a real LSM.

In case it may be helpful, I wanted to point out some previous work on
developing a LSM in Rust:

https://lore.kernel.org/linux-security-module/20250416213206.26060-2-kernel@o1oo11oo.de

... and if you are serious about developing a proper LSM in Rust, here
is some guidance for developing and submitting new LSMs upstream:

https://github.com/LinuxSecurityModule/kernel/blob/main/README.md#new-lsms

-- 
paul-moore.com


* Re: [PATCH v4 15/17] module: Introduce hash-based integrity checking
From: Eric Biggers @ 2026-03-11 21:14 UTC (permalink / raw)
  To: Thomas Weißschuh
  Cc: Nathan Chancellor, Arnd Bergmann, Luis Chamberlain, Petr Pavlu,
	Sami Tolvanen, Daniel Gomez, Paul Moore, James Morris,
	Serge E. Hallyn, Jonathan Corbet, Madhavan Srinivasan,
	Michael Ellerman, Nicholas Piggin, Naveen N Rao, Mimi Zohar,
	Roberto Sassu, Dmitry Kasatkin, Eric Snowberg, Nicolas Schier,
	Daniel Gomez, Aaron Tomlin, Christophe Leroy (CS GROUP),
	Nicolas Schier, Nicolas Bouchinet, Xiu Jianfeng,
	Fabian Grünbichler, Arnout Engelen, Mattia Rizzolo, kpcyrd,
	Christian Heusel, Câju Mihai-Drosi,
	Sebastian Andrzej Siewior, linux-kbuild, linux-kernel, linux-arch,
	linux-modules, linux-security-module, linux-doc, linuxppc-dev,
	linux-integrity
In-Reply-To: <5726fc65-7d24-4353-b341-81b785f2575c@t-8ch.de>

On Wed, Mar 11, 2026 at 02:19:02PM +0100, Thomas Weißschuh wrote:
> > > diff --git a/include/linux/module_signature.h b/include/linux/module_signature.h
> > > index a45ce3b24403..3b510651830d 100644
> > > --- a/include/linux/module_signature.h
> > > +++ b/include/linux/module_signature.h
> > > @@ -18,6 +18,7 @@ enum pkey_id_type {
> > >  	PKEY_ID_PGP,		/* OpenPGP generated key ID */
> > >  	PKEY_ID_X509,		/* X.509 arbitrary subjectKeyIdentifier */
> > >  	PKEY_ID_PKCS7,		/* Signature in PKCS#7 message */
> > > +	PKEY_ID_MERKLE,		/* Merkle proof for modules */
> > 
> > I recommend making the hash algorithm explicit:
> > 
> >         PKEY_ID_MERKLE_SHA256,	/* SHA-256 merkle proof for modules */
> > 
> > While I wouldn't encourage the addition of another hash algorithm
> > (specifying one good algorithm for now is absolutely the right choice),
> > if someone ever does need to add another one, we'd want them to be
> > guided to simply introduce a new value of this enum rather than hack it
> > in some other way.
> 
> The idea here was that this will only ever be used for modules built as
> part of the kernel build. So the actual implementation could change freely
> without affecting anything.
> 
> But I don't have hard feelings about it.

Ah, okay.  That's even better then: if someone adds another algorithm it
would simply be a kconfig option.

It seems 'struct module_signature' itself is intended to be a stable
ABI, though.  So I think there's an opportunity for confusion here.  It
might be worth leaving a note somewhere that the format of the
PKEY_ID_MERKLE portion of the struct does not need to be kept stable and
can freely change in each kernel build.

- Eric


* Re: [PATCH v2 7/10] security: Hornet LSM
From: Paul Moore @ 2026-03-11 20:50 UTC (permalink / raw)
  To: Blaise Boscaccy, Blaise Boscaccy, Jonathan Corbet, James Morris,
	Serge E. Hallyn, Mickaël Salaün, Günther Noack,
	Dr. David Alan Gilbert, Andrew Morton, James.Bottomley, dhowells,
	Fan Wu, Ryan Foster, linux-security-module, linux-doc,
	linux-kernel, bpf
In-Reply-To: <20260227233930.2418522-8-bboscaccy@linux.microsoft.com>

On Feb 27, 2026 Blaise Boscaccy <bboscaccy@linux.microsoft.com> wrote:
> 
> This adds the Hornet Linux Security Module which provides enhanced
> signature verification and data validation for eBPF programs. This
> allows users to continue to maintain an invariant that all code
> running inside of the kernel has actually been signed and verified, by
> the kernel.
> 
> This effort builds upon the currently accepted upstream solution. It
> further hardens it by providing deterministic, in-kernel checking of
> map hashes to solidify auditing along with preventing TOCTOU attacks
> against lskel map hashes.
> 
> Target map hashes are passed in via PKCS#7 signed attributes. Hornet
> determines the extent to which the eBPF program is signed and defers to
> other LSMs for policy decisions.
> 
> Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
> Nacked-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> ---
>  Documentation/admin-guide/LSM/Hornet.rst | 310 ++++++++++++++++++++++
>  Documentation/admin-guide/LSM/index.rst  |   1 +
>  MAINTAINERS                              |   9 +
>  include/linux/oid_registry.h             |   3 +
>  include/uapi/linux/lsm.h                 |   1 +
>  security/Kconfig                         |   3 +-
>  security/Makefile                        |   1 +
>  security/hornet/Kconfig                  |  11 +
>  security/hornet/Makefile                 |   7 +
>  security/hornet/hornet.asn1              |  13 +
>  security/hornet/hornet_lsm.c             | 323 +++++++++++++++++++++++
>  11 files changed, 681 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/admin-guide/LSM/Hornet.rst
>  create mode 100644 security/hornet/Kconfig
>  create mode 100644 security/hornet/Makefile
>  create mode 100644 security/hornet/hornet.asn1
>  create mode 100644 security/hornet/hornet_lsm.c
> 
> diff --git a/Documentation/admin-guide/LSM/Hornet.rst b/Documentation/admin-guide/LSM/Hornet.rst
> new file mode 100644
> index 000000000000..0dd4c03b8a7e
> --- /dev/null
> +++ b/Documentation/admin-guide/LSM/Hornet.rst
> @@ -0,0 +1,310 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +======
> +Hornet
> +======
> +
> +Hornet is a Linux Security Module that provides extensible signature
> +verification for eBPF programs. This is selectable at build-time with
> +``CONFIG_SECURITY_HORNET``.
> +
> +Overview
> +========
> +
> +Hornet addresses concerns from users who require strict audit trails and
> +verification guarantees for eBPF programs, especially in
> +security-sensitive environments. Many production systems need assurance
> +that only authorized, unmodified eBPF programs are loaded into the
> +kernel. Hornet provides this assurance through cryptographic signature
> +verification.
> +
> +When an eBPF program is loaded via the ``bpf()`` syscall, Hornet
> +verifies a PKCS#7 signature attached to the program instructions. The
> +signature is checked against the kernel's secondary keyring using the
> +existing kernel cryptographic infrastructure. In addition to signing the
> +program bytecode, Hornet supports signing SHA-256 hashes of associated
> +BPF maps, enabling integrity verification of map contents at load time
> +and at runtime.
> +
> +After verification, Hornet classifies the program into one of the
> +following integrity states and passes the result to a downstream LSM hook
> +(``bpf_prog_load_post_integrity``), allowing other security modules to
> +make policy decisions based on the verification outcome:
> +
> +``LSM_INT_VERDICT_OK``
> +  The program signature and all map hashes verified successfully.
> +
> +``LSM_INT_VERDICT_UNSIGNED``
> +  No signature was provided with the program.
> +
> +``LSM_INT_VERDICT_PARTIALSIG``
> +  The program signature verified, but the signing certificate is not
> +  trusted in the secondary keyring ...

Do you think there is value in separating this case out from _PARTIALSIG?
Maybe a LSM_INT_VERDICT_UNKNOWNKEY?

> +  ... or the signature did not contain
> +  hornet map hash data.
> +
> +``LSM_INT_VERDICT_BADSIG``
> +  The signature or a map hash failed verification.
> +
> +Hornet itself does not enforce a policy on whether unsigned or partially
> +signed programs should be rejected. It delegates that decision to
> +downstream LSMs via the ``bpf_prog_load_post_integrity`` hook, making it
> +a composable building block in a larger security architecture.
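
To make the four states concrete, here is a rough sketch of how a
downstream LSM might act on them. Only the verdict names come from the
documentation above; the numeric values, the function name, and the
allow_unsigned knob are hypothetical, and a real hook would take the
arguments defined by bpf_prog_load_post_integrity rather than this
simplified signature.

```c
#include <errno.h>
#include <stdbool.h>

/* Verdict names as documented; the numeric values are assumptions. */
enum lsm_int_verdict {
	LSM_INT_VERDICT_OK,
	LSM_INT_VERDICT_UNSIGNED,
	LSM_INT_VERDICT_PARTIALSIG,
	LSM_INT_VERDICT_BADSIG,
};

/*
 * Hypothetical policy: accept fully verified programs, optionally accept
 * unsigned ones, and reject everything else. Returns 0 or a negative errno.
 */
int example_prog_load_policy(enum lsm_int_verdict verdict, bool allow_unsigned)
{
	switch (verdict) {
	case LSM_INT_VERDICT_OK:
		return 0;
	case LSM_INT_VERDICT_UNSIGNED:
		return allow_unsigned ? 0 : -EPERM;
	case LSM_INT_VERDICT_PARTIALSIG:
	case LSM_INT_VERDICT_BADSIG:
		return -EPERM;
	}

	return -EINVAL;	/* verdict value this policy does not know about */
}
```

A separate verdict for an untrusted signing key would simply become one
more case here, which is where a policy could treat it differently from a
signature that lacks map hash data.
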
> +
> +Use Cases
> +=========
> +
> +- **Locked-down production environments**: Ensure only eBPF programs
> +  signed by a trusted authority can be loaded, preventing unauthorized
> +  or tampered programs from running in the kernel.
> +
> +- **Audit and compliance**: Provide cryptographic evidence that loaded
> +  eBPF programs match their expected build artifacts, supporting
> +  compliance requirements in regulated industries.
> +
> +- **Supply chain integrity**: Verify that eBPF programs and their
> +  associated map data have not been modified since they were built and
> +  signed, protecting against supply chain attacks.
> +
> +Threat Model
> +============
> +
> +Hornet protects against the following threats:
> +
> +- **Unauthorized eBPF program loading**: Programs that have not been
> +  signed by a trusted key will be reported as unsigned or badly signed.
> +
> +- **Tampering with program instructions**: Any modification to the eBPF
> +  bytecode after signing will cause signature verification to fail.
> +
> +- **Tampering with map data**: When map hashes are included in the
> +  signature, Hornet verifies that frozen BPF maps match their expected
> +  SHA-256 hashes at load time. Maps are also re-verified before program
> +  execution via ``BPF_PROG_RUN``.
> +
> +Hornet does **not** protect against:
> +
> +- Compromise of the signing key itself.
> +- Attacks that occur after a program has been loaded and verified.
> +- Programs loaded by the kernel itself (kernel-internal loads bypass
> +  the ``BPF_PROG_RUN`` map check).
> +
> +Known Limitations
> +=================
> +
> +- Hornet requires programs to use :doc:`light skeletons
> +  </bpf/libbpf/libbpf_naming_convention>` (lskels) for the signing
> +  workflow, as the tooling operates on lskel-generated headers.
> +
> +- A maximum of 64 maps per program can be tracked for hash
> +  verification.
> +
> +- Map hash verification requires the maps to be frozen before loading.
> +  Maps that are not frozen at load time will cause verification to fail
> +  when their hashes are included in the signature.
> +
> +- Hornet relies on the kernel's secondary keyring
> +  (``VERIFY_USE_SECONDARY_KEYRING``) for certificate trust. Keys must
> +  be provisioned into this keyring before programs can be verified.

I would add a bullet point describing the SHA256 limitation.  If I
understand things correctly this restriction comes from the core BPF
code and not Hornet itself, so it would be nice to have this documented
as it isn't immediately clear when looking only at the Hornet code.

> +Configuration
> +=============
> +
> +Build Configuration
> +-------------------
> +
> +Enable Hornet by setting the following kernel configuration option::
> +
> +  CONFIG_SECURITY_HORNET=y
> +
> +This option is found under :menuselection:`Security options --> Hornet
> +support` and depends on ``CONFIG_SECURITY``.
> +
> +When enabled, Hornet is included in the default LSM initialization order
> +and will appear in ``/sys/kernel/security/lsm``.
> +
> +Architecture
> +============
> +
> +Signature Verification Flow
> +---------------------------
> +
> +The following describes what happens when a userspace program calls
> +``bpf(BPF_PROG_LOAD, ...)`` with a signature attached:
> +
> +1. The ``bpf_prog_load_integrity`` LSM hook is invoked.
> +
> +2. Hornet reads the signature from the userspace buffer specified by
> +   ``attr->signature`` (with length ``attr->signature_size``).
> +
> +3. The PKCS#7 signature is verified against the program instructions
> +   using ``verify_pkcs7_signature()`` with the kernel's secondary
> +   keyring.
> +
> +4. The PKCS#7 message is parsed and its trust chain is validated via
> +   ``validate_pkcs7_trust()``.
> +
> +5. Hornet extracts the authenticated attribute identified by
> +   ``OID_hornet_data`` (OID ``2.25.316487325684022475439036912669789383960``)
> +   from the PKCS#7 message. This attribute contains an ASN.1-encoded set
> +   of map index/hash pairs.
> +
> +6. For each map hash entry, Hornet retrieves the corresponding BPF map
> +   via its file descriptor, confirms it is frozen, computes its SHA-256
> +   hash, and compares it against the signed hash.
> +
> +7. The resulting integrity verdict is passed to the
> +   ``bpf_prog_load_post_integrity`` hook so that downstream LSMs can
> +   enforce policy.
> +
> +Runtime Map Verification
> +------------------------
> +
> +When ``bpf(BPF_PROG_RUN, ...)`` is called from userspace, Hornet
> +re-verifies the hashes of all maps associated with the program. This
> +ensures that map contents have not been modified between program load
> +and execution. If any map hash no longer matches, the ``BPF_PROG_RUN``
> +command is denied.
> +
> +Userspace Interface
> +-------------------
> +
> +Signatures are passed to the kernel through fields in ``union bpf_attr``
> +when using the ``BPF_PROG_LOAD`` command:
> +
> +``signature``
> +  A pointer to a userspace buffer containing the PKCS#7 signature.
> +
> +``signature_size``
> +  The size of the signature buffer in bytes.
> +
> +ASN.1 Schema
> +------------
> +
> +Map hashes are encoded as a signed attribute in the PKCS#7 message using
> +the following ASN.1 schema::
> +
> +  HornetData ::= SET OF Map
> +
> +  Map ::= SEQUENCE {
> +      index   INTEGER,
> +      sha     OCTET STRING
> +  }
> +
> +Each ``Map`` entry contains the index of the map in the program's
> +``fd_array`` and its expected SHA-256 hash. A zero-length ``sha`` field
> +indicates that the map at that index should be skipped during
> +verification.
> +
> +Tooling
> +=======
> +
> +Helper scripts and a signature generation tool are provided in
> +``scripts/hornet/`` to support the development of signed eBPF light
> +skeletons.
> +
> +gen_sig
> +-------
> +
> +``gen_sig`` is a C program (using OpenSSL) that creates a PKCS#7
> +signature over eBPF program instructions and optionally includes
> +SHA-256 hashes of BPF maps as signed attributes.
> +
> +Usage::
> +
> +  gen_sig --data <instructions.bin> \
> +          --cert <signer.crt> \
> +          --key <signer.key> \
> +          [--pass <passphrase>] \
> +          --out <signature.p7b> \
> +          [--add <mapfile.bin>:<index> ...]
> +
> +``--data``
> +  Path to the binary file containing eBPF program instructions to sign.
> +
> +``--cert``
> +  Path to the signing certificate (PEM or DER format).
> +
> +``--key``
> +  Path to the private key (PEM or DER format).
> +
> +``--pass``
> +  Optional passphrase for the private key.
> +
> +``--out``
> +  Path to write the output PKCS#7 signature.
> +
> +``--add``
> +  Attach a map hash as a signed attribute. The argument is a path to a
> +  binary map file followed by a colon and the map's index in the
> +  ``fd_array``. This option may be specified multiple times.
> +
> +extract-skel.sh
> +---------------
> +
> +Extracts a named field from an autogenerated eBPF lskel header file.
> +Used internally by other helper scripts.
> +
> +extract-insn.sh
> +---------------
> +
> +Extracts the eBPF program instructions (``opts_insn``) from an lskel
> +header into a binary file suitable for signing with ``gen_sig``.
> +
> +extract-map.sh
> +--------------
> +
> +Extracts the map data (``opts_data``) from an lskel header into a
> +binary file suitable for hashing with ``gen_sig``.
> +
> +write-sig.sh
> +------------
> +
> +Replaces the signature data in an lskel header with a new signature
> +from a binary file. This is used to embed a freshly generated signature
> +back into the header after signing.
> +
> +Signing Workflow
> +================
> +
> +A typical workflow for building and signing an eBPF light skeleton is:
> +
> +1. **Compile the eBPF program**::
> +
> +     clang -O2 -target bpf -c program.bpf.c -o program.bpf.o
> +
> +2. **Generate the light skeleton header** using ``bpftool``::
> +
> +     bpftool gen skeleton -S program.bpf.o > loader.h
> +
> +3. **Extract instructions and map data** from the generated header::
> +
> +     scripts/hornet/extract-insn.sh loader.h > insn.bin
> +     scripts/hornet/extract-map.sh loader.h > map.bin
> +
> +4. **Generate the signature** with ``gen_sig``::
> +
> +     scripts/hornet/gen_sig \
> +       --key signing_key.pem \
> +       --cert signing_key.x509 \
> +       --data insn.bin \
> +       --add map.bin:0 \
> +       --out sig.bin
> +
> +5. **Embed the signature** back into the header::
> +
> +     scripts/hornet/write-sig.sh loader.h sig.bin > signed_loader.h
> +
> +6. **Build the loader program** using the signed header::
> +
> +     cc -o loader loader.c -lbpf
> +
> +The resulting loader program will pass the embedded signature to the
> +kernel when loading the eBPF program, enabling Hornet to verify it.
> +
> +Testing
> +=======
> +
> +Self-tests are provided in ``tools/testing/selftests/hornet/``. The test
> +suite builds a minimal eBPF program (``trivial.bpf.c``), signs it using
> +the workflow described above, and verifies that the signed program loads
> +successfully.
> diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
> index b44ef68f6e4d..57f6e9fbe5fd 100644
> --- a/Documentation/admin-guide/LSM/index.rst
> +++ b/Documentation/admin-guide/LSM/index.rst
> @@ -49,3 +49,4 @@ subdirectories.
>     SafeSetID
>     ipe
>     landlock
> +   Hornet
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 55af015174a5..6e91234a9ba4 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -11682,6 +11682,15 @@ S:	Maintained
>  F:	Documentation/devicetree/bindings/iio/pressure/honeywell,mprls0025pa.yaml
>  F:	drivers/iio/pressure/mprls0025pa*
>  
> +HORNET SECURITY MODULE
> +M:	Blaise Boscaccy <bboscaccy@linux.microsoft.com>
> +L:	linux-security-module@vger.kernel.org
> +S:	Supported
> +T:	git https://github.com/blaiseboscaccy/hornet.git
> +F:	Documentation/admin-guide/LSM/Hornet.rst
> +F:	scripts/hornet/
> +F:	security/hornet/
> +
>  HP BIOSCFG DRIVER
>  M:	Jorge Lopez <jorge.lopez2@hp.com>
>  L:	platform-driver-x86@vger.kernel.org
> diff --git a/include/linux/oid_registry.h b/include/linux/oid_registry.h
> index ebce402854de..bf852715aaea 100644
> --- a/include/linux/oid_registry.h
> +++ b/include/linux/oid_registry.h
> @@ -150,6 +150,9 @@ enum OID {
>  	OID_id_ml_dsa_65,			/* 2.16.840.1.101.3.4.3.18 */
>  	OID_id_ml_dsa_87,			/* 2.16.840.1.101.3.4.3.19 */
>  
> +	/* Hornet LSM */
> +	OID_hornet_data,	  /* 2.25.316487325684022475439036912669789383960 */
> +
>  	OID__NR
>  };
>  
> diff --git a/include/uapi/linux/lsm.h b/include/uapi/linux/lsm.h
> index 938593dfd5da..2ff9bcdd551e 100644
> --- a/include/uapi/linux/lsm.h
> +++ b/include/uapi/linux/lsm.h
> @@ -65,6 +65,7 @@ struct lsm_ctx {
>  #define LSM_ID_IMA		111
>  #define LSM_ID_EVM		112
>  #define LSM_ID_IPE		113
> +#define LSM_ID_HORNET		114
>  
>  /*
>   * LSM_ATTR_XXX definitions identify different LSM attributes
> diff --git a/security/Kconfig b/security/Kconfig
> index 6a4393fce9a1..283c4a103209 100644
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -230,6 +230,7 @@ source "security/safesetid/Kconfig"
>  source "security/lockdown/Kconfig"
>  source "security/landlock/Kconfig"
>  source "security/ipe/Kconfig"
> +source "security/hornet/Kconfig"
>  
>  source "security/integrity/Kconfig"
>  
> @@ -274,7 +275,7 @@ config LSM
>  	default "landlock,lockdown,yama,loadpin,safesetid,apparmor,selinux,smack,tomoyo,ipe,bpf" if DEFAULT_SECURITY_APPARMOR
>  	default "landlock,lockdown,yama,loadpin,safesetid,tomoyo,ipe,bpf" if DEFAULT_SECURITY_TOMOYO
>  	default "landlock,lockdown,yama,loadpin,safesetid,ipe,bpf" if DEFAULT_SECURITY_DAC
> -	default "landlock,lockdown,yama,loadpin,safesetid,selinux,smack,tomoyo,apparmor,ipe,bpf"
> +	default "landlock,lockdown,yama,loadpin,safesetid,selinux,smack,tomoyo,apparmor,ipe,hornet,bpf"
>  	help
>  	  A comma-separated list of LSMs, in initialization order.
>  	  Any LSMs left off this list, except for those with order
> diff --git a/security/Makefile b/security/Makefile
> index 4601230ba442..b68cb56e419b 100644
> --- a/security/Makefile
> +++ b/security/Makefile
> @@ -26,6 +26,7 @@ obj-$(CONFIG_CGROUPS)			+= device_cgroup.o
>  obj-$(CONFIG_BPF_LSM)			+= bpf/
>  obj-$(CONFIG_SECURITY_LANDLOCK)		+= landlock/
>  obj-$(CONFIG_SECURITY_IPE)		+= ipe/
> +obj-$(CONFIG_SECURITY_HORNET)		+= hornet/
>  
>  # Object integrity file lists
>  obj-$(CONFIG_INTEGRITY)			+= integrity/
> diff --git a/security/hornet/Kconfig b/security/hornet/Kconfig
> new file mode 100644
> index 000000000000..19406aa237ac
> --- /dev/null
> +++ b/security/hornet/Kconfig
> @@ -0,0 +1,11 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +config SECURITY_HORNET
> +	bool "Hornet support"
> +	depends on SECURITY
> +	default n
> +	help
> +	  This selects Hornet.
> +	  Further information can be found in
> +	  Documentation/admin-guide/LSM/Hornet.rst.
> +
> +	  If you are unsure how to answer this question, answer N.
> diff --git a/security/hornet/Makefile b/security/hornet/Makefile
> new file mode 100644
> index 000000000000..26b6f954f762
> --- /dev/null
> +++ b/security/hornet/Makefile
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +obj-$(CONFIG_SECURITY_HORNET) := hornet.o
> +
> +hornet-y := hornet.asn1.o \
> +	hornet_lsm.o \
> +
> +$(obj)/hornet.asn1.o: $(obj)/hornet.asn1.c $(obj)/hornet.asn1.h
> diff --git a/security/hornet/hornet.asn1 b/security/hornet/hornet.asn1
> new file mode 100644
> index 000000000000..c8d47b16b65d
> --- /dev/null
> +++ b/security/hornet/hornet.asn1
> @@ -0,0 +1,13 @@
> +-- SPDX-License-Identifier: BSD-3-Clause
> +--
> +-- Copyright (C) 2009 IETF Trust and the persons identified as authors
> +-- of the code
> +--
> +-- https://www.rfc-editor.org/rfc/rfc5652#section-3
> +
> +HornetData ::= SET OF Map
> +
> +Map ::= SEQUENCE {
> +	index			INTEGER ({ hornet_map_index }),
> +	sha			OCTET STRING ({ hornet_map_hash })
> +} ({ hornet_next_map })
> diff --git a/security/hornet/hornet_lsm.c b/security/hornet/hornet_lsm.c
> new file mode 100644
> index 000000000000..6c821d6441fb
> --- /dev/null
> +++ b/security/hornet/hornet_lsm.c
> @@ -0,0 +1,323 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Hornet Linux Security Module
> + *
> + * Author: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
> + *
> + * Copyright (C) 2026 Microsoft Corporation
> + */
> +
> +#include <linux/lsm_hooks.h>
> +#include <uapi/linux/lsm.h>
> +#include <linux/bpf.h>
> +#include <linux/verification.h>
> +#include <crypto/public_key.h>
> +#include <linux/module_signature.h>
> +#include <crypto/pkcs7.h>
> +#include <linux/sort.h>
> +#include <linux/asn1_decoder.h>
> +#include <linux/oid_registry.h>
> +#include "hornet.asn1.h"
> +
> +#define MAX_USED_MAPS 64
> +
> +struct hornet_maps {
> +	bpfptr_t fd_array;
> +};
> +
> +struct hornet_parse_context {
> +	int indexes[MAX_USED_MAPS];
> +	bool skips[MAX_USED_MAPS];
> +	unsigned char hashes[SHA256_DIGEST_SIZE * MAX_USED_MAPS];
> +	int hash_count;
> +};

I might include a brief comment near the top of this file referencing
the hash algorithm limitation in the Hornet docs; otherwise someone is
surely going to advocate for hash agility improvements at some point.

> +struct hornet_prog_security_struct {
> +	bool checked[MAX_USED_MAPS];
> +	unsigned char hashes[SHA256_DIGEST_SIZE * MAX_USED_MAPS];
> +};
> +
> +struct hornet_map_security_struct {
> +	bool checked;
> +	int index;
> +};
> +
> +struct lsm_blob_sizes hornet_blob_sizes __ro_after_init = {
> +	.lbs_bpf_map = sizeof(struct hornet_map_security_struct),
> +	.lbs_bpf_prog = sizeof(struct hornet_prog_security_struct),
> +};
> +
> +static inline struct hornet_prog_security_struct *
> +hornet_bpf_prog_security(struct bpf_prog *prog)
> +{
> +	return prog->aux->security + hornet_blob_sizes.lbs_bpf_prog;
> +}
> +
> +static inline struct hornet_map_security_struct *
> +hornet_bpf_map_security(struct bpf_map *map)
> +{
> +	return map->security + hornet_blob_sizes.lbs_bpf_map;
> +}
> +
> +static int hornet_verify_hashes(struct hornet_maps *maps,
> +				struct hornet_parse_context *ctx,
> +				struct bpf_prog *prog)
> +{
> +	int map_fd;
> +	u32 i;
> +	struct bpf_map *map;
> +	int err = 0;
> +	unsigned char hash[SHA256_DIGEST_SIZE];
> +	struct hornet_prog_security_struct *security = hornet_bpf_prog_security(prog);
> +	struct hornet_map_security_struct *map_security;
> +
> +	for (i = 0; i < ctx->hash_count; i++) {
> +		if (ctx->skips[i]) {
> +			security->checked[i] = false;

I'm not going to argue against an explicit false assignment here, but
as an FYI, when the LSM framework allocates the various object blobs it
(re)sets the blob memory to zero via kzalloc().  Even if/when the LSM
framework moves to some other allocation scheme we will still need to keep
that reset-to-zero behavior.

The same applies to the BPF map blobs.

> +			continue;
> +		}
> +
> +		err = copy_from_bpfptr_offset(&map_fd, maps->fd_array,
> +					      ctx->indexes[i] * sizeof(map_fd),
> +					      sizeof(map_fd));
> +		if (err < 0)
> +			return LSM_INT_VERDICT_BADSIG;
> +
> +		CLASS(fd, f)(map_fd);
> +		if (fd_empty(f))
> +			return LSM_INT_VERDICT_BADSIG;
> +		if (unlikely(fd_file(f)->f_op != &bpf_map_fops))
> +			return LSM_INT_VERDICT_BADSIG;

I'm wondering if it is worth defining a generic LSM_INT_VERDICT_FAULT
verdict to indicate a system error when verifying the integrity rather
than a bad signature.  Yes, the enforcement action will likely be the
same, but it might help when debugging or chasing forensic data.

> +		map = fd_file(f)->private_data;
> +		if (!map->frozen)
> +			return LSM_INT_VERDICT_BADSIG;
> +
> +		map->ops->map_get_hash(map, SHA256_DIGEST_SIZE, hash);
> +
> +		err = memcmp(hash, &ctx->hashes[i * SHA256_DIGEST_SIZE],
> +			      SHA256_DIGEST_SIZE);
> +		if (err)
> +			return LSM_INT_VERDICT_BADSIG;
> +
> +		security->checked[i] = true;
> +		memcpy(&security->hashes[i * SHA256_DIGEST_SIZE], hash, SHA256_DIGEST_SIZE);
> +		map_security = hornet_bpf_map_security(map);
> +		map_security->checked = true;
> +		map_security->index = i;
> +	}
> +	return LSM_INT_VERDICT_OK;
> +}
> +
> +int hornet_next_map(void *context, size_t hdrlen,
> +		     unsigned char tag,
> +		     const void *value, size_t vlen)
> +{
> +	struct hornet_parse_context *ctx = (struct hornet_parse_context *)context;
> +
> +	ctx->hash_count++;

Do we need a check here to ensure that ctx->hash_count doesn't exceed
MAX_USED_MAPS?  If not here, where do we ensure we don't blow past
MAX_USED_MAPS?

What does Hornet do if the number of hashed maps is greater than
MAX_USED_MAPS?  I'm guessing we would want it to return an error and
fail the load?
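
For illustration, a bounds check along these lines might look like the
userspace sketch below.  It is only a sketch under assumptions: the
parse_*() names and struct parse_context are stand-ins for the Hornet
callbacks and struct hornet_parse_context, the -E2BIG return value is an
arbitrary choice rather than anything in the patch, and it assumes the
decoder runs the per-field actions before the action attached to the
enclosing SEQUENCE, which is why the check sits in the field callbacks.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_USED_MAPS 64
#define SHA256_DIGEST_SIZE 32

/* Userspace stand-in for struct hornet_parse_context. */
struct parse_context {
	int indexes[MAX_USED_MAPS];
	int skips[MAX_USED_MAPS];
	unsigned char hashes[SHA256_DIGEST_SIZE * MAX_USED_MAPS];
	int hash_count;
};

/* Index callback: refuse to write once every slot is already used. */
static int parse_map_index(struct parse_context *ctx,
			   const void *value, size_t vlen)
{
	if (ctx->hash_count >= MAX_USED_MAPS)
		return -E2BIG;
	if (vlen != 1)
		return -EINVAL;

	ctx->indexes[ctx->hash_count] = *(const unsigned char *)value;
	return 0;
}

/* Hash callback: same slot check, then accept a full hash or an empty one. */
static int parse_map_hash(struct parse_context *ctx,
			  const void *value, size_t vlen)
{
	if (ctx->hash_count >= MAX_USED_MAPS)
		return -E2BIG;
	if (vlen != SHA256_DIGEST_SIZE && vlen != 0)
		return -EINVAL;

	/* A zero-length hash marks the entry as "skip". */
	ctx->skips[ctx->hash_count] = (vlen == 0);
	if (vlen)
		memcpy(&ctx->hashes[ctx->hash_count * SHA256_DIGEST_SIZE],
		       value, vlen);
	return 0;
}

/* End-of-entry callback: the slot was validated above, so just advance. */
static int parse_next_map(struct parse_context *ctx)
{
	ctx->hash_count++;
	return 0;
}
```

With a check like this, a 65th entry makes the decode fail before
anything is written past the arrays, and the caller can turn that into a
failed load.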

> +	return 0;
> +}
> +
> +int hornet_map_index(void *context, size_t hdrlen,
> +		     unsigned char tag,
> +		     const void *value, size_t vlen)
> +{
> +	struct hornet_parse_context *ctx = (struct hornet_parse_context *)context;
> +
> +	if (vlen > 1)
> +		return -EINVAL;
> +
> +	ctx->indexes[ctx->hash_count] = *(u8 *)value;
> +	return 0;
> +}
> +
> +int hornet_map_hash(void *context, size_t hdrlen,
> +		    unsigned char tag,
> +		    const void *value, size_t vlen)
> +
> +{
> +	struct hornet_parse_context *ctx = (struct hornet_parse_context *)context;
> +
> +	if (vlen != SHA256_DIGEST_SIZE && vlen != 0)
> +		return -EINVAL;
> +
> +	if (vlen) {
> +		ctx->skips[ctx->hash_count] = false;
> +		memcpy(&ctx->hashes[ctx->hash_count * SHA256_DIGEST_SIZE], value, vlen);
> +	} else
> +		ctx->skips[ctx->hash_count] = true;
> +
> +	return 0;
> +}
> +
> +static int hornet_check_program(struct bpf_prog *prog, union bpf_attr *attr,
> +				struct bpf_token *token, bool is_kernel)
> +{
> +	struct hornet_maps maps = {0};
> +	bpfptr_t usig = make_bpfptr(attr->signature, is_kernel);
> +	struct pkcs7_message *msg;
> +	struct hornet_parse_context *ctx;
> +	void *sig;
> +	int err;
> +	const void *authattrs;
> +	size_t authattrs_len;
> +
> +	if (!attr->signature)
> +		return LSM_INT_VERDICT_UNSIGNED;
> +
> +	ctx = kzalloc(sizeof(struct hornet_parse_context), GFP_KERNEL);
> +	if (!ctx)
> +		return -ENOMEM;

I think I mentioned this previously, but let me repeat myself in case I
didn't ... we don't want to mix LSM_INT_VERDICT enums and errno values
in the return value.  Yes, you can probably get away with it in the
majority of cases, but I worry it is a problem waiting to happen.  I
count only four parameters right now, so adding a verdict enum pointer
shouldn't be too difficult.
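
To make the suggestion concrete, here is a rough userspace sketch of the
split: errno values in the return code, the verdict through a pointer.
Everything in it is an assumption for illustration only.  The enum
values are placeholders (the real definitions live elsewhere in the
series), and struct check_inputs is a made-up summary of what the
individual checks in hornet_check_program() would have found.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder values; the real enum in the series may differ. */
enum lsm_int_verdict {
	LSM_INT_VERDICT_OK,
	LSM_INT_VERDICT_UNSIGNED,
	LSM_INT_VERDICT_PARTIALSIG,
	LSM_INT_VERDICT_BADSIG,
};

/* Hypothetical summary of the checks performed for one program load. */
struct check_inputs {
	bool has_signature;
	bool alloc_failed;
	bool sig_valid;
	bool key_trusted;
	bool has_map_hashes;
	bool map_hashes_match;
};

/*
 * Returns 0 or a negative errno for system failures only.  The integrity
 * result travels separately through *verdict, which is written only when
 * the function returns 0.
 */
static int check_program(const struct check_inputs *in,
			 enum lsm_int_verdict *verdict)
{
	if (!in->has_signature) {
		*verdict = LSM_INT_VERDICT_UNSIGNED;
		return 0;
	}
	if (in->alloc_failed)
		return -ENOMEM;
	if (!in->sig_valid) {
		*verdict = LSM_INT_VERDICT_BADSIG;
		return 0;
	}
	if (!in->key_trusted || !in->has_map_hashes) {
		*verdict = LSM_INT_VERDICT_PARTIALSIG;
		return 0;
	}

	*verdict = in->map_hashes_match ? LSM_INT_VERDICT_OK
					: LSM_INT_VERDICT_BADSIG;
	return 0;
}
```

The caller then checks the return value first and only looks at the
verdict when it is zero, so a future verdict value can never be
mistaken for an errno or the other way around.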

> +	maps.fd_array = make_bpfptr(attr->fd_array, is_kernel);
> +	sig = kzalloc(attr->signature_size, GFP_KERNEL);
> +	if (!sig) {
> +		err = -ENOMEM;
> +		goto out;
> +	}
> +	err = copy_from_bpfptr(sig, usig, attr->signature_size);
> +	if (err != 0)
> +		goto cleanup_sig;
> +
> +	err = verify_pkcs7_signature(prog->insnsi, prog->len * sizeof(struct bpf_insn),
> +				     sig, attr->signature_size, VERIFY_USE_SECONDARY_KEYRING,
> +				     VERIFYING_BPF_SIGNATURE, NULL, NULL);
> +	if (err < 0) {
> +		err = LSM_INT_VERDICT_BADSIG;
> +		goto cleanup_sig;
> +	}
> +
> +	msg = pkcs7_parse_message(sig, attr->signature_size);
> +	if (IS_ERR(msg)) {
> +		err = LSM_INT_VERDICT_BADSIG;
> +		goto cleanup_sig;
> +	}
> +
> +	if (validate_pkcs7_trust(msg, VERIFY_USE_SECONDARY_KEYRING)) {
> +		err = LSM_INT_VERDICT_PARTIALSIG;
> +		goto cleanup_msg;
> +	}
> +	if (pkcs7_get_authattr(msg, OID_hornet_data,
> +			       &authattrs, &authattrs_len) == -ENODATA) {
> +		err = LSM_INT_VERDICT_PARTIALSIG;
> +		goto cleanup_msg;
> +	}
> +
> +	err = asn1_ber_decoder(&hornet_decoder, ctx, authattrs, authattrs_len);
> +	if (err < 0 || authattrs == NULL) {
> +		err = LSM_INT_VERDICT_PARTIALSIG;
> +		goto cleanup_msg;
> +	}
> +	err = hornet_verify_hashes(&maps, ctx, prog);
> +
> +cleanup_msg:
> +	pkcs7_free_message(msg);
> +cleanup_sig:
> +	kfree(sig);
> +out:
> +	kfree(ctx);
> +	return err;
> +}
> +
> +static const struct lsm_id hornet_lsmid = {
> +	.name = "hornet",
> +	.id = LSM_ID_HORNET,
> +};
> +
> +static int hornet_bpf_prog_load_integrity(struct bpf_prog *prog, union bpf_attr *attr,
> +					  struct bpf_token *token, bool is_kernel)
> +{
> +	int result = hornet_check_program(prog, attr, token, is_kernel);

Can you explain a bit why we check for the kernel flag in hornet_bpf(),
but not here?  It may be that a brief comment in hornet_bpf() explaining
the kernel flag exception would be helpful.

> +	if (result < 0)
> +		return result;
> +
> +	return security_bpf_prog_load_post_integrity(prog, attr, token, is_kernel,
> +						     &hornet_lsmid, result);
> +}
> +
> +static int hornet_verify_map(struct bpf_prog *prog, int index)
> +{
> +	unsigned char hash[SHA256_DIGEST_SIZE];
> +	int i;
> +	struct bpf_map *map;
> +	struct hornet_prog_security_struct *security = hornet_bpf_prog_security(prog);
> +	struct hornet_map_security_struct *map_security;
> +
> +	if (!security->checked[index])
> +		return 0;
> +
> +	for (i = 0; i < prog->aux->used_map_cnt; i++) {
> +		map = prog->aux->used_maps[i];
> +		map_security = hornet_bpf_map_security(map);
> +		if (map_security->index != index)
> +			continue;
> +
> +		if (!map->frozen)
> +			return -EINVAL;

Unless there is serious tampering going on we should never see an
unfrozen map here, yes?

We probably also want to use a return value other than -EINVAL as this
is an access/permission denial.  I would think -EACCES or -EPERM would be
more appropriate.

> +		map->ops->map_get_hash(map, SHA256_DIGEST_SIZE, hash);
> +		if (memcmp(hash, &security->hashes[index * SHA256_DIGEST_SIZE],
> +			   SHA256_DIGEST_SIZE) != 0)

Presumably this is just being extra careful?

> +			return -EINVAL;

See above, -EACCES or -EPERM is likely a better choice here.

> +		else
> +			return 0;
> +	}
> +	return -EINVAL;

See above.
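
As a rough illustration of the three comments above, the re-verification
could report every mismatch as a denial.  This is a userspace sketch
under assumptions: struct fake_map and verify_map() are hypothetical
stand-ins for the kernel objects, and -EACCES is picked arbitrarily over
-EPERM.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define DIGEST_SIZE 32

/* Userspace stand-in for the map state Hornet looks at. */
struct fake_map {
	bool frozen;
	unsigned char hash[DIGEST_SIZE];	/* hash of the current contents */
};

/*
 * Compare a map against the hash recorded at load time.  Both failure
 * modes are policy denials, so both return -EACCES rather than -EINVAL.
 */
static int verify_map(const struct fake_map *map,
		      const unsigned char *recorded)
{
	if (!map->frozen)
		return -EACCES;
	if (memcmp(map->hash, recorded, DIGEST_SIZE) != 0)
		return -EACCES;
	return 0;
}
```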

> +}
> +
> +static int hornet_check_prog_maps(u32 ufd)
> +{
> +	CLASS(fd, f)(ufd);
> +	struct bpf_prog *prog;
> +	int i, result = 0;
> +
> +	if (fd_empty(f))
> +		return -EBADF;
> +	if (fd_file(f)->f_op != &bpf_prog_fops)
> +		return -EINVAL;
> +
> +	prog = fd_file(f)->private_data;
> +
> +	mutex_lock(&prog->aux->used_maps_mutex);
> +	if (!prog->aux->used_map_cnt)
> +		goto out;
> +
> +	for (i = 0; i < prog->aux->used_map_cnt; i++) {
> +		result = hornet_verify_map(prog, i);
> +		if (result)
> +			goto out;
> +	}
> +out:
> +	mutex_unlock(&prog->aux->used_maps_mutex);
> +	return result;
> +}
> +
> +static int hornet_bpf(int cmd, union bpf_attr *attr, unsigned int size, bool kernel)
> +{
> +	if (cmd != BPF_PROG_RUN)
> +		return 0;
> +	if (kernel)
> +		return 0;
> +
> +	return hornet_check_prog_maps(attr->test.prog_fd);
> +}
> +
> +static struct security_hook_list hornet_hooks[] __ro_after_init = {
> +	LSM_HOOK_INIT(bpf_prog_load_integrity, hornet_bpf_prog_load_integrity),
> +	LSM_HOOK_INIT(bpf, hornet_bpf),
> +};
> +
> +static int __init hornet_init(void)
> +{
> +	pr_info("Hornet: eBPF signature verification enabled\n");
> +	security_add_hooks(hornet_hooks, ARRAY_SIZE(hornet_hooks), &hornet_lsmid);
> +	return 0;
> +}
> +
> +DEFINE_LSM(hornet) = {
> +	.id = &hornet_lsmid,
> +	.blobs = &hornet_blob_sizes,
> +	.init = hornet_init,
> +};
> -- 
> 2.52.0

--
paul-moore.com


* Re: [PATCH 0/3] Firmware LSM hook
From: Leon Romanovsky @ 2026-03-11 19:16 UTC
  To: Paul Moore
  Cc: James Morris, Serge E. Hallyn, Jason Gunthorpe, Saeed Mahameed,
	Itay Avraham, Dave Jiang, Jonathan Cameron, linux-security-module,
	linux-kernel, linux-rdma, Chiara Meiohas, Maher Sanalla,
	Edward Srouji
In-Reply-To: <CAHC9VhR0iuzYRpi3vPdKAbsOJ-DoMvWV-c7TXVcAmb3u8J4JwA@mail.gmail.com>

On Wed, Mar 11, 2026 at 12:06:09PM -0400, Paul Moore wrote:
> On Wed, Mar 11, 2026 at 4:20 AM Leon Romanovsky <leon@kernel.org> wrote:
> > On Tue, Mar 10, 2026 at 05:40:02PM -0400, Paul Moore wrote:
> > > On Tue, Mar 10, 2026 at 3:30 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > On Tue, Mar 10, 2026 at 02:24:40PM -0400, Paul Moore wrote:
> > > > > On Tue, Mar 10, 2026 at 5:07 AM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > > On Mon, Mar 09, 2026 at 07:10:25PM -0400, Paul Moore wrote:
> > > > > > > On Mon, Mar 9, 2026 at 3:37 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > > > > On Mon, Mar 09, 2026 at 02:32:39PM -0400, Paul Moore wrote:
> > > > > > > > > On Mon, Mar 9, 2026 at 7:15 AM Leon Romanovsky <leon@kernel.org> wrote:
> > > > >
> > > > > ...
> > > > >
> > > > > > > > > Hi Leon,
> > > > > > > > >
> > > > > > > > > At the link below, you'll find guidance on submitting new LSM hooks.
> > > > > > > > > Please take a look and let me know if you have any questions.
> > > > > > > > >
> > > > > > > > > https://github.com/LinuxSecurityModule/kernel/blob/main/README.md#new-lsm-hooks
> > > > > > > >
> > > > > > > > I assume that you are referring to this part:
> > > > > > >
> > > > > > > I'm referring to all of the guidance, but yes, at the very least that
> > > > > > > is something that I think we need to see in a future revision of this
> > > > > > > patchset.
> > > > > > >
> > > > > > > >  * New LSM hooks must demonstrate their usefulness by providing a meaningful
> > > > > > > >    implementation for at least one in-kernel LSM. The goal is to demonstrate
> > > > > > > >    the purpose and expected semantics of the hooks. Out of tree kernel code,
> > > > > > > >    and pass through implementations, such as the BPF LSM, are not eligible
> > > > > > > >    for LSM hook reference implementations.
> > > > > > > >
> > > > > > > > The point is that we are not inspecting a kernel call, but the FW mailbox,
> > > > > > > > which has very little meaning to the kernel. From the kernel's perspective,
> > > > > > > > all relevant checks have already been performed, but the existing capability
> > > > > > > > granularity does not allow us to distinguish between FW_CMD1 and FW_CMD2.
> > > > > > >
> > > > > > > It might help if you could phrase this differently, as I'm not
> > > > > > > entirely clear on your argument.  LSMs are not limited to enforcing
> > > > > > > access controls on requests the kernel understands (see the SELinux
> > > > > > > userspace object manager concept), and the idea of access controls
> > > > > > > with greater granularity than capabilities is one of the main reasons
> > > > > > > people look to LSMs for access control (SELinux, AppArmor, Smack,
> > > > > > > etc.).
> > > > > >
> > > > > > I should note that my understanding of LSM is limited, so some parts of my
> > > > > > answers may be inaccurate.
> > > > > >
> > > > > > What I am referring to is a different level of granularity — specifically,
> > > > > > the internals of the firmware commands. In the proposed approach, BPF
> > > > > > programs would make decisions based on data passed through the mailbox.
> > > > > > That mailbox format varies across vendors, and may even differ between
> > > > > > firmware versions from the same vendor.
> > > > >
> > > > > That helps, thank you.
> > > > >
> > > > > > > > Here we propose a generic interface that can be applied to all FWCTL
> > > > > > > > devices without out-of-tree kernel code at all.
> > > > > > >
> > > > > > > I expected to see a patch implementing some meaningful support for
> > > > > > > access controls using these hooks in one of the existing LSMs, I did
> > > > > > > not see that in this patchset.
> > > > > >
> > > > > > In some cases, the mailbox is forwarded from user space unchanged, but
> > > > > > in others the kernel modifies it before submitting it to the FW.
> > > > >
> > > > > Without a standard format, opcode definitions, etc. I suspect
> > > > > integrating this into an LSM will present a number of challenges.
> > > >
> > > > The opcode is relatively easy to extract from the mailbox and pass to the LSM.
> > > > All drivers implement some variant of mlx5ctl_validate_rpc()/devx_is_general_cmd()
> > > > to validate the opcode. The problem is that this check alone is not sufficient.
> > > >
> > > > > Instead of performing an LSM access control check before submitting
> > > > > the firmware command, it might be easier from an LSM perspective to
> > > > > have the firmware call into the kernel/LSM for an access control
> > > > > decision before performing a security-relevant action.
> > > >
> > > > Ultimately, the LSM must make a decision for each executed firmware
> > > > command. This will need to be handled one way or another, and will
> > > > likely require parsing the mailbox again.
> > >
> > > As it's unlikely that parsing the mailbox is something that a LSM will
> > > want to handle,
> >
> > I believe this approach offers the cleanest and most natural way to support
> > all mailbox‑based devices.
> >
> > > my suggestion was to leverage the existing mailbox parsing in the firmware
> > > and require the firmware to call into the LSM when authorization is needed.
> > >
> > > > > This removes the challenge of parsing/interpreting the arbitrary firmware commands,
> > > > > but it does add some additional complexity of having to generically
> > > > > represent the security relevant actions the firmware might request
> > > >
> > > > The difference here is that the proposed LSM hook is intended to disable
> > > > certain functionality provided by the firmware, effectively depending on
> > > > the operator’s preferences.
> > >
> > > My suggestion would also allow a LSM hook to disable certain firmware
> > > functionality; however, the firmware itself would need to call the LSM
> > > to check if the functionality is authorized.
> >
> > This suggestion adds an extra call from the FW to the LSM for every command, even
> > for systems which don't have LSM at all.
> 
> If latency is a concern, I imagine we could create an LSM hook to
> report whether any LSMs provided firmware access controls.  The
> firmware could then use that hook, potentially caching the result, to
> limit its calls into the LSM.
> 
> > The FW must pass the already parsed data
> > back to the LSM; otherwise, the LSM has no basis to decide whether to accept or
> > reject the request.
> >
> > For example, consider the MLX5_CMD_OP_QUERY_DCT command handled in
> > mlx5ctl_validate_rpc(). DCT in RDMA refers to Dynamically Connected
> > Transport, a Mellanox-specific extension that effectively introduces a new
> > QP‑type family on top of the standard RC/UC/UD transports. This type does not
> > exist for other vendors, each of whom provides its own vendor‑specific
> > extensions. All parameters here are tightly coupled to those specific
> > commands.
> >
> > It is unrealistic to expect different firmware implementations to supply
> > their data in a common format that would allow the LSM to make a generic
> > decision.
> 
> That's unfortunate as that would be the easiest path forward.
> Regardless, you are welcome to work on whatever implementation you
> think makes sense for any of the in-tree LSMs, with that in place we
> can take another look at the firmware command hooks.
> 
> Good luck.

I'll take advantage of the upcoming weekend and look into what can be done
here.

Thanks

> 
> -- 
> paul-moore.com


* [PATCH v3 3/3] ima: Add support for staging measurements for deletion
From: Roberto Sassu @ 2026-03-11 17:19 UTC
  To: corbet, skhan, zohar, dmitry.kasatkin, eric.snowberg, paul,
	jmorris, serge
  Cc: linux-doc, linux-kernel, linux-integrity, linux-security-module,
	gregorylumen, chenste, nramas, Roberto Sassu
In-Reply-To: <20260311171956.2317781-1-roberto.sassu@huaweicloud.com>

From: Roberto Sassu <roberto.sassu@huawei.com>

Introduce the ability to stage the IMA measurement list for deletion.
Staging means moving the current content of the measurement list to a
separate location, and allowing users to read and delete it. This causes
the measurement list to be atomically truncated before new measurements can
be added. Only one staged list can exist at a time. In the event of
kexec(), staging is reverted and staged entries are carried over to the
new kernel.

Staged measurements can be deleted entirely, or partially, with the
non-deleted ones added back to the IMA measurements list. This allows
remote attestation agents to easily separate the measurements that were
verified (staged and deleted) from those that weren't due to the race
between taking a TPM quote and reading the measurements list.

User space is responsible for concatenating the staged IMA measurements
list portions (excluding the measurements added back to the IMA
measurements list) in the temporal order in which the operations were
done, together with the current measurement list. Then, it can send the
collected data to the remote verifiers.

The benefit of staging and deleting is the ability to free precious kernel
memory, in exchange for delegating to user space the reconstruction of the
full measurement list from the chunks. No trust needs to be given to user
space, since the integrity of the measurement list is protected by the TPM.

By default, staging the measurements list does not alter the hash table.
When staging and deleting are done, IMA is still able to detect collisions
on the staged and later deleted measurement entries, by keeping the entry
digests (only template data are freed).

However, since during the measurements list serialization only the SHA1
digest is passed, and since there are no template data to recalculate the
other digests from, the hash table is currently not populated with digests
from staged/deleted entries after kexec().

Introduce the new kernel option ima_flush_htable to decide whether or not
the digests of staged measurement entries are flushed from the hash table,
when they are deleted. Flushing the hash table is supported only when
deleting all the staged measurements, since in that case the old hash table
can be quickly swapped with a blank one (otherwise entries would have to be
removed one by one for partial deletion).

Then, introduce ascii_runtime_measurements_<algo>_staged and
binary_runtime_measurements_<algo>_staged interfaces to stage and delete
the measurements. Use 'echo A > <IMA interface>' and
'echo D > <IMA interface>' to respectively stage and delete the entire
measurements list. Use 'echo N > <IMA interface>', with N between 1 and
ULONG_MAX - 1, to delete the selected staged portion of the measurements
list.
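The request format described above can be sketched in user space as
follows. This is only an approximation of the checks done by the write
handler in this patch: the enum and function names are hypothetical, and
the digit parsing is stricter than kstrtoul() (no sign is accepted).

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical names; they do not appear in the patch. */
enum staged_req { REQ_INVALID, REQ_STAGE, REQ_DELETE_ALL, REQ_DELETE_N };

/*
 * Classify a request buffer of 'len' bytes. For REQ_DELETE_N the
 * number of entries to delete is stored in *n.
 */
static enum staged_req parse_staged_request(const char *buf, size_t len,
					    unsigned long *n)
{
	unsigned long val = 0;
	size_t i;

	/* At least one character plus the trailing newline. */
	if (len < 2 || buf[len - 1] != '\n')
		return REQ_INVALID;

	if (buf[0] == 'A')
		return len == 2 ? REQ_STAGE : REQ_INVALID;
	if (buf[0] == 'D')
		return len == 2 ? REQ_DELETE_ALL : REQ_INVALID;

	/* Decimal digits only, rejecting values that would overflow. */
	for (i = 0; i < len - 1; i++) {
		unsigned long digit;

		if (buf[i] < '0' || buf[i] > '9')
			return REQ_INVALID;
		digit = (unsigned long)(buf[i] - '0');
		if (val > (ULONG_MAX - digit) / 10)
			return REQ_INVALID;
		val = val * 10 + digit;
	}

	/* 0 and ULONG_MAX are outside the accepted interval. */
	if (val == 0 || val == ULONG_MAX)
		return REQ_INVALID;

	*n = val;
	return REQ_DELETE_N;
}
```

A caller would pass the exact bytes produced by 'echo', for example
"A\n" with length 2, and act on the returned request type.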

The ima_measure_users counter (protected by the ima_measure_mutex mutex)
has been introduced to protect access to the measurements list and the
staged part. The open method of all the measurement interfaces has been
extended to allow only one writer at a time or, alternatively, multiple
readers. The write permission is used to stage and delete the measurements,
the read permission to read them. Writing also requires the CAP_SYS_ADMIN
capability.
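The one-writer-or-many-readers rule can be illustrated with a small
single-threaded model of the counter. The struct and function names below
are hypothetical; the kernel code holds ima_measure_mutex around every
update, which this sketch leaves out for brevity.

```c
#include <errno.h>
#include <stdbool.h>

/* users > 0: that many readers; users == -1: one writer; 0: idle. */
struct measure_gate {
	long users;
};

static int gate_acquire(struct measure_gate *g, bool write)
{
	/* A writer needs the gate idle; a reader only needs no writer. */
	if ((write && g->users != 0) || (!write && g->users < 0))
		return -EBUSY;

	if (write)
		g->users--;
	else
		g->users++;
	return 0;
}

static void gate_release(struct measure_gate *g, bool write)
{
	/* Undo exactly what the matching gate_acquire() did. */
	if (write)
		g->users++;
	else
		g->users--;
}
```

With this scheme a second open for write, or any open for read while a
writer is active, fails with -EBUSY instead of blocking.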

Finally, introduce the binary_lists enum and turn binary_runtime_size
and ima_num_entries into arrays, to keep track of their values for the
current IMA measurements list (BINARY), current list plus staged
measurements (BINARY_STAGED) and the cumulative list since IMA
initialization (BINARY_FULL).

Use BINARY in ima_show_measurements_count(), BINARY_STAGED in
ima_add_kexec_buffer() and BINARY_FULL in ima_measure_kexec_event().

It should be noted that the BINARY_FULL counter is not passed through
kexec. Thus, the number of entries included in the kexec critical data
records refers to the entries since the previous kexec records.
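
As an illustration of the three views, the following user-space model
tracks the entry counters according to the definitions given above. It is
a sketch of the described semantics, not the arithmetic used in the patch;
the struct, the helper names and the extra 'staged' field are assumptions
made for this example.

```c
/* Mirrors the enum added to ima.h by this patch. */
enum binary_lists { BINARY, BINARY_STAGED, BINARY_FULL, BINARY__LAST };

/* Hypothetical user-space model of the per-list entry counters. */
struct list_counters {
	unsigned long num[BINARY__LAST];
	unsigned long staged;	/* entries currently on the staged list */
};

/* A new measurement is counted in all three views. */
static void model_add(struct list_counters *c)
{
	int i;

	for (i = 0; i < BINARY__LAST; i++)
		c->num[i]++;
}

/* Staging moves the whole active list aside: BINARY restarts from 0. */
static int model_stage(struct list_counters *c)
{
	if (c->staged != 0 || c->num[BINARY] == 0)
		return -1;

	c->staged = c->num[BINARY];
	c->num[BINARY] = 0;
	return 0;
}

/* Delete n staged entries; the remaining ones rejoin the active list. */
static int model_delete(struct list_counters *c, unsigned long n)
{
	if (n == 0 || n > c->staged)
		return -1;

	c->num[BINARY_STAGED] -= n;
	c->num[BINARY] = c->num[BINARY_STAGED];
	c->staged = 0;
	return 0;
}
```

In this model BINARY_FULL only ever grows, while BINARY and BINARY_STAGED
are equal again once nothing is staged.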

Note: This code derives from the Alt-IMA Huawei project, whose license is
      GPL-2.0 OR MIT.

Link: https://github.com/linux-integrity/linux/issues/1
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
---
Changelog
v2:
 - Forbid partial deletion when flushing hash table (suggested by Mimi)
 - Ignore ima_flush_htable if CONFIG_IMA_DISABLE_HTABLE is enabled
 - BINARY_SIZE_* renamed to BINARY_* for better clarity
 - Removed ima_measurements_staged_exist and testing list empty instead
 - ima_queue_stage_trim() and ima_queue_delete_staged_trimmed() renamed to
   ima_queue_stage() and ima_queue_delete_staged()
 - New delete interval [1, ULONG_MAX - 1]
 - Rename ima_measure_lock to ima_measure_mutex
 - Move seq_open() and seq_release() outside the ima_measure_mutex lock
 - Drop ima_measurements_staged_read() and use seq_read() instead
 - Optimize create_securityfs_measurement_lists() changes
 - New file name format with _staged suffix at the end of the file name
 - Use _rcu list variant in ima_dump_measurement_list()
 - Remove support for direct trimming and splice the remaining entries to
   the active list (suggested by Mimi)
 - Hot swap the hash table if flushing is requested

v1:
 - Support for direct trimming without staging
 - Support unstaging on kexec (requested by Gregory Lumen)
---
 .../admin-guide/kernel-parameters.txt         |   4 +
 security/integrity/ima/ima.h                  |  17 +-
 security/integrity/ima/ima_fs.c               | 266 ++++++++++++++++--
 security/integrity/ima/ima_kexec.c            |  43 ++-
 security/integrity/ima/ima_queue.c            | 205 +++++++++++++-
 5 files changed, 484 insertions(+), 51 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index cb850e5290c2..7a377812aa0a 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2345,6 +2345,10 @@ Kernel parameters
 			Use the canonical format for the binary runtime
 			measurements, instead of host native format.
 
+	ima_flush_htable  [IMA]
+			Flush the IMA hash table when deleting all the
+			staged measurement entries.
+
 	ima_hash=	[IMA]
 			Format: { md5 | sha1 | rmd160 | sha256 | sha384
 				   | sha512 | ... }
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index ccd037d49de7..e8aaf1e62139 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -28,6 +28,15 @@ enum ima_show_type { IMA_SHOW_BINARY, IMA_SHOW_BINARY_NO_FIELD_LEN,
 		     IMA_SHOW_BINARY_OLD_STRING_FMT, IMA_SHOW_ASCII };
 enum tpm_pcrs { TPM_PCR0 = 0, TPM_PCR8 = 8, TPM_PCR10 = 10 };
 
+/*
+ * BINARY: current binary measurements list
+ * BINARY_STAGED: current binary measurements list + staged entries
+ * BINARY_FULL: binary measurements list since IMA init (lost after kexec)
+ */
+enum binary_lists {
+	BINARY, BINARY_STAGED, BINARY_FULL, BINARY__LAST
+};
+
 /* digest size for IMA, fits SHA1 or MD5 */
 #define IMA_DIGEST_SIZE		SHA1_DIGEST_SIZE
 #define IMA_EVENT_NAME_LEN_MAX	255
@@ -118,6 +127,7 @@ struct ima_queue_entry {
 	struct ima_template_entry *entry;
 };
 extern struct list_head ima_measurements;	/* list of all measurements */
+extern struct list_head ima_measurements_staged; /* list of staged meas. */
 
 /* Some details preceding the binary serialized measurement list */
 struct ima_kexec_hdr {
@@ -282,11 +292,13 @@ struct ima_template_desc *ima_template_desc_current(void);
 struct ima_template_desc *ima_template_desc_buf(void);
 struct ima_template_desc *lookup_template_desc(const char *name);
 bool ima_template_has_modsig(const struct ima_template_desc *ima_template);
+int ima_queue_stage(void);
+int ima_queue_delete_staged(unsigned long req_value);
 int ima_restore_measurement_entry(struct ima_template_entry *entry);
 int ima_restore_measurement_list(loff_t bufsize, void *buf);
 int ima_measurements_show(struct seq_file *m, void *v);
 int __init ima_init_htable(void);
-unsigned long ima_get_binary_runtime_size(void);
+unsigned long ima_get_binary_runtime_size(enum binary_lists binary_list);
 int ima_init_template(void);
 void ima_init_template_list(void);
 int __init ima_init_digests(void);
@@ -299,9 +311,10 @@ int ima_lsm_policy_change(struct notifier_block *nb, unsigned long event,
  */
 extern spinlock_t ima_queue_lock;
 
-extern atomic_long_t ima_num_entries;
+extern atomic_long_t ima_num_entries[BINARY__LAST];
 extern atomic_long_t ima_num_violations;
 extern struct hlist_head __rcu *ima_htable;
+extern struct mutex ima_extend_list_mutex;
 
 static inline unsigned int ima_hash_key(u8 *digest)
 {
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index aaa460d70ff7..cf85b0892275 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -24,7 +24,17 @@
 
 #include "ima.h"
 
+/*
+ * Requests:
+ * 'A\n': stage the entire measurements list
+ * 'D\n': delete all staged measurements
+ * '[1, ULONG_MAX - 1]\n' delete N measurements entries and unstage the rest
+ */
+#define STAGED_REQ_LENGTH 21
+
 static DEFINE_MUTEX(ima_write_mutex);
+static DEFINE_MUTEX(ima_measure_mutex);
+static long ima_measure_users;
 
 bool ima_canonical_fmt;
 static int __init default_canonical_fmt_setup(char *str)
@@ -63,7 +73,7 @@ static ssize_t ima_show_measurements_count(struct file *filp,
 					   char __user *buf,
 					   size_t count, loff_t *ppos)
 {
-	return ima_show_counter(buf, count, ppos, &ima_num_entries);
+	return ima_show_counter(buf, count, ppos, &ima_num_entries[BINARY]);
 
 }
 
@@ -73,14 +83,15 @@ static const struct file_operations ima_measurements_count_ops = {
 };
 
 /* returns pointer to hlist_node */
-static void *ima_measurements_start(struct seq_file *m, loff_t *pos)
+static void *_ima_measurements_start(struct seq_file *m, loff_t *pos,
+				     struct list_head *head)
 {
 	loff_t l = *pos;
 	struct ima_queue_entry *qe;
 
 	/* we need a lock since pos could point beyond last element */
 	rcu_read_lock();
-	list_for_each_entry_rcu(qe, &ima_measurements, later) {
+	list_for_each_entry_rcu(qe, head, later) {
 		if (!l--) {
 			rcu_read_unlock();
 			return qe;
@@ -90,7 +101,18 @@ static void *ima_measurements_start(struct seq_file *m, loff_t *pos)
 	return NULL;
 }
 
-static void *ima_measurements_next(struct seq_file *m, void *v, loff_t *pos)
+static void *ima_measurements_start(struct seq_file *m, loff_t *pos)
+{
+	return _ima_measurements_start(m, pos, &ima_measurements);
+}
+
+static void *ima_measurements_staged_start(struct seq_file *m, loff_t *pos)
+{
+	return _ima_measurements_start(m, pos, &ima_measurements_staged);
+}
+
+static void *_ima_measurements_next(struct seq_file *m, void *v, loff_t *pos,
+				    struct list_head *head)
 {
 	struct ima_queue_entry *qe = v;
 
@@ -102,7 +124,18 @@ static void *ima_measurements_next(struct seq_file *m, void *v, loff_t *pos)
 	rcu_read_unlock();
 	(*pos)++;
 
-	return (&qe->later == &ima_measurements) ? NULL : qe;
+	return (&qe->later == head) ? NULL : qe;
+}
+
+static void *ima_measurements_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	return _ima_measurements_next(m, v, pos, &ima_measurements);
+}
+
+static void *ima_measurements_staged_next(struct seq_file *m, void *v,
+					  loff_t *pos)
+{
+	return _ima_measurements_next(m, v, pos, &ima_measurements_staged);
 }
 
 static void ima_measurements_stop(struct seq_file *m, void *v)
@@ -198,16 +231,145 @@ static const struct seq_operations ima_measurments_seqops = {
 	.show = ima_measurements_show
 };
 
+static int ima_measure_lock(bool write)
+{
+	mutex_lock(&ima_measure_mutex);
+	if ((write && ima_measure_users != 0) ||
+	    (!write && ima_measure_users < 0)) {
+		mutex_unlock(&ima_measure_mutex);
+		return -EBUSY;
+	}
+
+	if (write)
+		ima_measure_users--;
+	else
+		ima_measure_users++;
+	mutex_unlock(&ima_measure_mutex);
+	return 0;
+}
+
+static void ima_measure_unlock(bool write)
+{
+	mutex_lock(&ima_measure_mutex);
+	if (write)
+		ima_measure_users++;
+	else
+		ima_measure_users--;
+	mutex_unlock(&ima_measure_mutex);
+}
+
+static int _ima_measurements_open(struct inode *inode, struct file *file,
+				  const struct seq_operations *seq_ops)
+{
+	bool write = (file->f_mode & FMODE_WRITE);
+	int ret;
+
+	if (write && !capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	ret = ima_measure_lock(write);
+	if (ret < 0)
+		return ret;
+
+	ret = seq_open(file, seq_ops);
+	if (ret < 0)
+		ima_measure_unlock(write);
+
+	return ret;
+}
+
 static int ima_measurements_open(struct inode *inode, struct file *file)
 {
-	return seq_open(file, &ima_measurments_seqops);
+	return _ima_measurements_open(inode, file, &ima_measurments_seqops);
+}
+
+static int ima_measurements_release(struct inode *inode, struct file *file)
+{
+	bool write = (file->f_mode & FMODE_WRITE);
+	int ret;
+
+	ret = seq_release(inode, file);
+
+	ima_measure_unlock(write);
+
+	return ret;
 }
 
 static const struct file_operations ima_measurements_ops = {
 	.open = ima_measurements_open,
 	.read = seq_read,
 	.llseek = seq_lseek,
-	.release = seq_release,
+	.release = ima_measurements_release,
+};
+
+static const struct seq_operations ima_measurments_staged_seqops = {
+	.start = ima_measurements_staged_start,
+	.next = ima_measurements_staged_next,
+	.stop = ima_measurements_stop,
+	.show = ima_measurements_show
+};
+
+static int ima_measurements_staged_open(struct inode *inode, struct file *file)
+{
+	return _ima_measurements_open(inode, file,
+				      &ima_measurments_staged_seqops);
+}
+
+static ssize_t ima_measurements_staged_write(struct file *file,
+					     const char __user *buf,
+					     size_t datalen, loff_t *ppos)
+{
+	char req[STAGED_REQ_LENGTH];
+	unsigned long req_value;
+	int ret;
+
+	if (*ppos > 0 || datalen < 2 || datalen > STAGED_REQ_LENGTH)
+		return -EINVAL;
+
+	if (copy_from_user(req, buf, datalen) != 0)
+		return -EFAULT;
+
+	if (req[datalen - 1] != '\n')
+		return -EINVAL;
+
+	req[datalen - 1] = '\0';
+
+	switch (req[0]) {
+	case 'A':
+		if (datalen != 2)
+			return -EINVAL;
+
+		ret = ima_queue_stage();
+		break;
+	case 'D':
+		if (datalen != 2)
+			return -EINVAL;
+
+		ret = ima_queue_delete_staged(ULONG_MAX);
+		break;
+	default:
+		ret = kstrtoul(req, 10, &req_value);
+		if (ret < 0)
+			return ret;
+
+		if (req_value == ULONG_MAX)
+			return -ERANGE;
+
+		ret = ima_queue_delete_staged(req_value);
+	}
+
+	if (ret < 0)
+		return ret;
+
+	return datalen;
+}
+
+static const struct file_operations ima_measurements_staged_ops = {
+	.open = ima_measurements_staged_open,
+	.read = seq_read,
+	.write = ima_measurements_staged_write,
+	.llseek = seq_lseek,
+	.release = ima_measurements_release,
 };
 
 void ima_print_digest(struct seq_file *m, u8 *digest, u32 size)
@@ -272,14 +434,37 @@ static const struct seq_operations ima_ascii_measurements_seqops = {
 
 static int ima_ascii_measurements_open(struct inode *inode, struct file *file)
 {
-	return seq_open(file, &ima_ascii_measurements_seqops);
+	return _ima_measurements_open(inode, file,
+				      &ima_ascii_measurements_seqops);
 }
 
 static const struct file_operations ima_ascii_measurements_ops = {
 	.open = ima_ascii_measurements_open,
 	.read = seq_read,
 	.llseek = seq_lseek,
-	.release = seq_release,
+	.release = ima_measurements_release,
+};
+
+static const struct seq_operations ima_ascii_measurements_staged_seqops = {
+	.start = ima_measurements_staged_start,
+	.next = ima_measurements_staged_next,
+	.stop = ima_measurements_stop,
+	.show = ima_ascii_measurements_show
+};
+
+static int ima_ascii_measurements_staged_open(struct inode *inode,
+					      struct file *file)
+{
+	return _ima_measurements_open(inode, file,
+				      &ima_ascii_measurements_staged_seqops);
+}
+
+static const struct file_operations ima_ascii_measurements_staged_ops = {
+	.open = ima_ascii_measurements_staged_open,
+	.read = seq_read,
+	.write = ima_measurements_staged_write,
+	.llseek = seq_lseek,
+	.release = ima_measurements_release,
 };
 
 static ssize_t ima_read_policy(char *path)
@@ -385,10 +570,21 @@ static const struct seq_operations ima_policy_seqops = {
 };
 #endif
 
-static int __init create_securityfs_measurement_lists(void)
+static int __init create_securityfs_measurement_lists(bool staging)
 {
+	const struct file_operations *ascii_ops = &ima_ascii_measurements_ops;
+	const struct file_operations *binary_ops = &ima_measurements_ops;
+	mode_t permissions = S_IRUSR | S_IRGRP;
+	const char *file_suffix = "";
 	int count = NR_BANKS(ima_tpm_chip);
 
+	if (staging) {
+		ascii_ops = &ima_ascii_measurements_staged_ops;
+		binary_ops = &ima_measurements_staged_ops;
+		file_suffix = "_staged";
+		permissions |= S_IWUSR | S_IWGRP;
+	}
+
 	if (ima_sha1_idx >= NR_BANKS(ima_tpm_chip))
 		count++;
 
@@ -398,26 +594,33 @@ static int __init create_securityfs_measurement_lists(void)
 		struct dentry *dentry;
 
 		if (algo == HASH_ALGO__LAST)
-			sprintf(file_name, "ascii_runtime_measurements_tpm_alg_%x",
-				ima_tpm_chip->allocated_banks[i].alg_id);
+			snprintf(file_name, sizeof(file_name),
+				 "ascii_runtime_measurements_tpm_alg_%x%s",
+				 ima_tpm_chip->allocated_banks[i].alg_id,
+				 file_suffix);
 		else
-			sprintf(file_name, "ascii_runtime_measurements_%s",
-				hash_algo_name[algo]);
-		dentry = securityfs_create_file(file_name, S_IRUSR | S_IRGRP,
+			snprintf(file_name, sizeof(file_name),
+				 "ascii_runtime_measurements_%s%s",
+				 hash_algo_name[algo], file_suffix);
+		dentry = securityfs_create_file(file_name, permissions,
 						ima_dir, (void *)(uintptr_t)i,
-						&ima_ascii_measurements_ops);
+						ascii_ops);
 		if (IS_ERR(dentry))
 			return PTR_ERR(dentry);
 
 		if (algo == HASH_ALGO__LAST)
-			sprintf(file_name, "binary_runtime_measurements_tpm_alg_%x",
-				ima_tpm_chip->allocated_banks[i].alg_id);
+			snprintf(file_name, sizeof(file_name),
+				 "binary_runtime_measurements_tpm_alg_%x%s",
+				 ima_tpm_chip->allocated_banks[i].alg_id,
+				 file_suffix);
 		else
-			sprintf(file_name, "binary_runtime_measurements_%s",
-				hash_algo_name[algo]);
-		dentry = securityfs_create_file(file_name, S_IRUSR | S_IRGRP,
+			snprintf(file_name, sizeof(file_name),
+				 "binary_runtime_measurements_%s%s",
+				 hash_algo_name[algo], file_suffix);
+
+		dentry = securityfs_create_file(file_name, permissions,
 						ima_dir, (void *)(uintptr_t)i,
-						&ima_measurements_ops);
+						binary_ops);
 		if (IS_ERR(dentry))
 			return PTR_ERR(dentry);
 	}
@@ -517,7 +720,10 @@ int __init ima_fs_init(void)
 		goto out;
 	}
 
-	ret = create_securityfs_measurement_lists();
+	ret = create_securityfs_measurement_lists(false);
+	if (ret == 0)
+		ret = create_securityfs_measurement_lists(true);
+
 	if (ret != 0)
 		goto out;
 
@@ -535,6 +741,20 @@ int __init ima_fs_init(void)
 		goto out;
 	}
 
+	dentry = securityfs_create_symlink("binary_runtime_measurements_staged",
+		ima_dir, "binary_runtime_measurements_sha1_staged", NULL);
+	if (IS_ERR(dentry)) {
+		ret = PTR_ERR(dentry);
+		goto out;
+	}
+
+	dentry = securityfs_create_symlink("ascii_runtime_measurements_staged",
+		ima_dir, "ascii_runtime_measurements_sha1_staged", NULL);
+	if (IS_ERR(dentry)) {
+		ret = PTR_ERR(dentry);
+		goto out;
+	}
+
 	dentry = securityfs_create_file("runtime_measurements_count",
 				   S_IRUSR | S_IRGRP, ima_dir, NULL,
 				   &ima_measurements_count_ops);
diff --git a/security/integrity/ima/ima_kexec.c b/security/integrity/ima/ima_kexec.c
index 5801649fbbef..70ee3a039df2 100644
--- a/security/integrity/ima/ima_kexec.c
+++ b/security/integrity/ima/ima_kexec.c
@@ -42,8 +42,8 @@ void ima_measure_kexec_event(const char *event_name)
 	long len;
 	int n;
 
-	buf_size = ima_get_binary_runtime_size();
-	len = atomic_long_read(&ima_num_entries);
+	buf_size = ima_get_binary_runtime_size(BINARY_FULL);
+	len = atomic_long_read(&ima_num_entries[BINARY_FULL]);
 
 	n = scnprintf(ima_kexec_event, IMA_KEXEC_EVENT_LEN,
 		      "kexec_segment_size=%lu;ima_binary_runtime_size=%lu;"
@@ -80,6 +80,17 @@ static int ima_alloc_kexec_file_buf(size_t segment_size)
 	return 0;
 }
 
+static int ima_dump_measurement(struct ima_kexec_hdr *khdr,
+				struct ima_queue_entry *qe)
+{
+	if (ima_kexec_file.count >= ima_kexec_file.size)
+		return -EINVAL;
+
+	khdr->count++;
+	ima_measurements_show(&ima_kexec_file, qe);
+	return 0;
+}
+
 static int ima_dump_measurement_list(unsigned long *buffer_size, void **buffer,
 				     unsigned long segment_size)
 {
@@ -95,17 +106,26 @@ static int ima_dump_measurement_list(unsigned long *buffer_size, void **buffer,
 
 	memset(&khdr, 0, sizeof(khdr));
 	khdr.version = 1;
-	/* This is an append-only list, no need to hold the RCU read lock */
-	list_for_each_entry_rcu(qe, &ima_measurements, later, true) {
-		if (ima_kexec_file.count < ima_kexec_file.size) {
-			khdr.count++;
-			ima_measurements_show(&ima_kexec_file, qe);
-		} else {
-			ret = -EINVAL;
+	/* It can race with ima_queue_stage() and ima_queue_delete_staged(). */
+	mutex_lock(&ima_extend_list_mutex);
+
+	list_for_each_entry_rcu(qe, &ima_measurements_staged, later,
+				lockdep_is_held(&ima_extend_list_mutex)) {
+		ret = ima_dump_measurement(&khdr, qe);
+		if (ret < 0)
 			break;
-		}
 	}
 
+	list_for_each_entry_rcu(qe, &ima_measurements, later,
+				lockdep_is_held(&ima_extend_list_mutex)) {
+		if (!ret)
+			ret = ima_dump_measurement(&khdr, qe);
+		if (ret < 0)
+			break;
+	}
+
+	mutex_unlock(&ima_extend_list_mutex);
+
 	/*
 	 * fill in reserved space with some buffer details
 	 * (eg. version, buffer size, number of measurements)
@@ -159,7 +179,8 @@ void ima_add_kexec_buffer(struct kimage *image)
 	else
 		extra_memory = CONFIG_IMA_KEXEC_EXTRA_MEMORY_KB * 1024;
 
-	binary_runtime_size = ima_get_binary_runtime_size() + extra_memory;
+	binary_runtime_size = ima_get_binary_runtime_size(BINARY_STAGED) +
+			      extra_memory;
 
 	if (binary_runtime_size >= ULONG_MAX - PAGE_SIZE)
 		kexec_segment_size = ULONG_MAX;
diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
index 2050b9d21e70..08cd60fa959e 100644
--- a/security/integrity/ima/ima_queue.c
+++ b/security/integrity/ima/ima_queue.c
@@ -22,29 +22,48 @@
 
 #define AUDIT_CAUSE_LEN_MAX 32
 
+bool ima_flush_htable;
+static int __init ima_flush_htable_setup(char *str)
+{
+	if (IS_ENABLED(CONFIG_IMA_DISABLE_HTABLE)) {
+		pr_warn("Hash table not enabled, ignoring request to flush\n");
+		return 1;
+	}
+
+	ima_flush_htable = true;
+	return 1;
+}
+__setup("ima_flush_htable", ima_flush_htable_setup);
+
 /* pre-allocated array of tpm_digest structures to extend a PCR */
 static struct tpm_digest *digests;
 
 LIST_HEAD(ima_measurements);	/* list of all measurements */
+LIST_HEAD(ima_measurements_staged); /* list of staged measurements */
 #ifdef CONFIG_IMA_KEXEC
-static unsigned long binary_runtime_size;
+static unsigned long binary_runtime_size[BINARY__LAST];
 #else
-static unsigned long binary_runtime_size = ULONG_MAX;
+static unsigned long binary_runtime_size[BINARY__LAST] = {
+	[0 ... BINARY__LAST - 1] = ULONG_MAX
+};
 #endif
 
 /* num of stored meas. in the list */
-atomic_long_t ima_num_entries = ATOMIC_LONG_INIT(0);
+atomic_long_t ima_num_entries[BINARY__LAST] = {
+	[0 ... BINARY__LAST - 1] = ATOMIC_LONG_INIT(0)
+};
+
 /* num of violations in the list */
 atomic_long_t ima_num_violations = ATOMIC_LONG_INIT(0);
 
 /* key: inode (before secure-hashing a file) */
 struct hlist_head __rcu *ima_htable;
 
-/* mutex protects atomicity of extending measurement list
+/* mutex protects atomicity of extending and staging measurement list
  * and extending the TPM PCR aggregate. Since tpm_extend can take
  * long (and the tpm driver uses a mutex), we can't use the spinlock.
  */
-static DEFINE_MUTEX(ima_extend_list_mutex);
+DEFINE_MUTEX(ima_extend_list_mutex);
 
 /*
  * Used internally by the kernel to suspend measurements.
@@ -140,7 +159,7 @@ static int ima_add_digest_entry(struct ima_template_entry *entry,
 {
 	struct ima_queue_entry *qe;
 	struct hlist_head *htable;
-	unsigned int key;
+	unsigned int key, i;
 
 	qe = kmalloc_obj(*qe);
 	if (qe == NULL) {
@@ -155,19 +174,25 @@ static int ima_add_digest_entry(struct ima_template_entry *entry,
 	htable = rcu_dereference_protected(ima_htable,
 				lockdep_is_held(&ima_extend_list_mutex));
 
-	atomic_long_inc(&ima_num_entries);
+	for (i = 0; i < BINARY__LAST; i++)
+		atomic_long_inc(&ima_num_entries[i]);
+
 	if (update_htable) {
 		key = ima_hash_key(entry->digests[ima_hash_algo_idx].digest);
 		hlist_add_head_rcu(&qe->hnext, &htable[key]);
 	}
 
-	if (binary_runtime_size != ULONG_MAX) {
-		int size;
+	for (i = 0; i < BINARY__LAST; i++) {
+		if (binary_runtime_size[i] != ULONG_MAX) {
+			int size;
 
-		size = get_binary_runtime_size(entry);
-		binary_runtime_size = (binary_runtime_size < ULONG_MAX - size) ?
-		     binary_runtime_size + size : ULONG_MAX;
+			size = get_binary_runtime_size(entry);
+			binary_runtime_size[i] =
+				(binary_runtime_size[i] < ULONG_MAX - size) ?
+				binary_runtime_size[i] + size : ULONG_MAX;
+		}
 	}
+
 	return 0;
 }
 
@@ -176,12 +201,18 @@ static int ima_add_digest_entry(struct ima_template_entry *entry,
  * entire binary_runtime_measurement list, including the ima_kexec_hdr
  * structure.
  */
-unsigned long ima_get_binary_runtime_size(void)
+unsigned long ima_get_binary_runtime_size(enum binary_lists binary_list)
 {
-	if (binary_runtime_size >= (ULONG_MAX - sizeof(struct ima_kexec_hdr)))
+	unsigned long val;
+
+	mutex_lock(&ima_extend_list_mutex);
+	val = binary_runtime_size[binary_list];
+	mutex_unlock(&ima_extend_list_mutex);
+
+	if (val >= (ULONG_MAX - sizeof(struct ima_kexec_hdr)))
 		return ULONG_MAX;
 	else
-		return binary_runtime_size + sizeof(struct ima_kexec_hdr);
+		return val + sizeof(struct ima_kexec_hdr);
 }
 
 static int ima_pcr_extend(struct tpm_digest *digests_arg, int pcr)
@@ -262,6 +293,150 @@ int ima_add_template_entry(struct ima_template_entry *entry, int violation,
 	return result;
 }
 
+int ima_queue_stage(void)
+{
+	int ret = 0;
+
+	mutex_lock(&ima_extend_list_mutex);
+	if (!list_empty(&ima_measurements_staged)) {
+		ret = -EEXIST;
+		goto out_unlock;
+	}
+
+	if (list_empty(&ima_measurements)) {
+		ret = -ENOENT;
+		goto out_unlock;
+	}
+
+	list_replace(&ima_measurements, &ima_measurements_staged);
+	INIT_LIST_HEAD(&ima_measurements);
+	atomic_long_set(&ima_num_entries[BINARY], 0);
+	if (IS_ENABLED(CONFIG_IMA_KEXEC))
+		binary_runtime_size[BINARY] = 0;
+out_unlock:
+	mutex_unlock(&ima_extend_list_mutex);
+	return ret;
+}
+
+int ima_queue_delete_staged(unsigned long req_value)
+{
+	unsigned long req_value_copy = req_value;
+	unsigned long size_to_remove = 0, num_to_remove = 0;
+	struct ima_queue_entry *qe, *qe_tmp;
+	struct list_head *cut_pos = NULL;
+	LIST_HEAD(ima_measurements_trim);
+	struct hlist_head *old_queue = NULL;
+	unsigned int i;
+
+	if (req_value == 0) {
+		pr_err("Must delete at least one entry\n");
+		return -EINVAL;
+	}
+
+	if (req_value < ULONG_MAX && ima_flush_htable) {
+		pr_err("Deleting staged N measurements not supported when flushing the hash table is requested\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * Safe walk (no concurrent write), not under ima_extend_list_mutex
+	 * for performance reasons.
+	 */
+	list_for_each_entry(qe, &ima_measurements_staged, later) {
+		size_to_remove += get_binary_runtime_size(qe->entry);
+		num_to_remove++;
+
+		if (req_value < ULONG_MAX && --req_value_copy == 0) {
+			/* qe->later always points to a valid list entry. */
+			cut_pos = &qe->later;
+			break;
+		}
+	}
+
+	if (req_value < ULONG_MAX && req_value_copy > 0)
+		return -ENOENT;
+
+	mutex_lock(&ima_extend_list_mutex);
+	if (list_empty(&ima_measurements_staged)) {
+		mutex_unlock(&ima_extend_list_mutex);
+		return -ENOENT;
+	}
+
+	if (req_value < ULONG_MAX) {
+		/*
+		 * ima_dump_measurement_list() does not modify the list,
+		 * cut_pos remains the same even if it was computed before
+		 * the lock.
+		 */
+		__list_cut_position(&ima_measurements_trim,
+				    &ima_measurements_staged, cut_pos);
+	} else {
+		list_replace(&ima_measurements_staged, &ima_measurements_trim);
+		INIT_LIST_HEAD(&ima_measurements_staged);
+	}
+
+	atomic_long_sub(num_to_remove, &ima_num_entries[BINARY_STAGED]);
+	atomic_long_add(atomic_long_read(&ima_num_entries[BINARY_STAGED]),
+			&ima_num_entries[BINARY]);
+	atomic_long_set(&ima_num_entries[BINARY_STAGED],
+			atomic_long_read(&ima_num_entries[BINARY]));
+
+	if (IS_ENABLED(CONFIG_IMA_KEXEC)) {
+		binary_runtime_size[BINARY_STAGED] -= size_to_remove;
+		binary_runtime_size[BINARY] +=
+					binary_runtime_size[BINARY_STAGED];
+		binary_runtime_size[BINARY_STAGED] =
+					binary_runtime_size[BINARY];
+	}
+
+	if (ima_flush_htable) {
+		old_queue = ima_alloc_replace_htable();
+		if (IS_ERR(old_queue)) {
+			mutex_unlock(&ima_extend_list_mutex);
+			return PTR_ERR(old_queue);
+		}
+	}
+
+	/*
+	 * Splice (prepend) any remaining non-deleted staged entries to the
+	 * active list (RCU not needed, there cannot be concurrent readers).
+	 */
+	list_splice(&ima_measurements_staged, &ima_measurements);
+	INIT_LIST_HEAD(&ima_measurements_staged);
+	mutex_unlock(&ima_extend_list_mutex);
+
+	if (ima_flush_htable) {
+		synchronize_rcu();
+		kfree(old_queue);
+	}
+
+	list_for_each_entry_safe(qe, qe_tmp, &ima_measurements_trim, later) {
+		/*
+		 * Safe to free template_data here without synchronize_rcu()
+		 * because the only htable reader, ima_lookup_digest_entry(),
+		 * accesses only entry->digests, not template_data. If new
+		 * htable readers are added that access template_data, a
+		 * synchronize_rcu() is required here.
+		 */
+		for (i = 0; i < qe->entry->template_desc->num_fields; i++) {
+			kfree(qe->entry->template_data[i].data);
+			qe->entry->template_data[i].data = NULL;
+			qe->entry->template_data[i].len = 0;
+		}
+
+		list_del(&qe->later);
+
+		/* No leak if !ima_flush_htable, referenced by ima_htable. */
+		if (ima_flush_htable) {
+			kfree(qe->entry->digests);
+			kfree(qe->entry);
+			kfree(qe);
+		}
+	}
+
+	return 0;
+}
+
 int ima_restore_measurement_entry(struct ima_template_entry *entry)
 {
 	int result = 0;
-- 
2.43.0



* [PATCH v3 2/3] ima: Replace static htable queue with dynamically allocated array
From: Roberto Sassu @ 2026-03-11 17:19 UTC (permalink / raw)
  To: corbet, skhan, zohar, dmitry.kasatkin, eric.snowberg, paul,
	jmorris, serge
  Cc: linux-doc, linux-kernel, linux-integrity, linux-security-module,
	gregorylumen, chenste, nramas, Roberto Sassu
In-Reply-To: <20260311171956.2317781-1-roberto.sassu@huaweicloud.com>

From: Roberto Sassu <roberto.sassu@huawei.com>

The IMA hash table is a fixed-size array of hlist_head buckets:

    struct hlist_head ima_htable[IMA_MEASURE_HTABLE_SIZE];

IMA_MEASURE_HTABLE_SIZE is (1 << IMA_HASH_BITS) = 1024 buckets, each a
struct hlist_head (one pointer, 8 bytes on 64-bit). That is 8 KiB allocated
in BSS for every kernel, regardless of whether IMA is ever used, and
regardless of how many measurements are actually made.
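
The size figure can be checked with a few lines of user-space C. The value
of IMA_HASH_BITS is taken to be 10 here, consistent with the 1024 buckets
mentioned above, and struct hlist_head_model is a stand-in for the
kernel's struct hlist_head rather than the real definition.

```c
#include <stddef.h>

#define IMA_HASH_BITS 10	/* assumed: matches the 1024-bucket figure */
#define IMA_MEASURE_HTABLE_SIZE (1 << IMA_HASH_BITS)

/* Stand-in for the kernel's struct hlist_head: a single pointer. */
struct hlist_head_model {
	void *first;
};

/* Bytes occupied by the whole bucket array. */
static size_t htable_bytes(void)
{
	return IMA_MEASURE_HTABLE_SIZE * sizeof(struct hlist_head_model);
}
```

On a platform with 8-byte pointers this evaluates to 8192 bytes, i.e. the
8 KiB quoted above.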

Replace the fixed-size array with an RCU-protected pointer to a dynamically
allocated array that is initialized in ima_init_htable(), which is called
from ima_init() during early boot. ima_init_htable() calls the static
function ima_alloc_replace_htable() which, besides initializing the hash
table the first time, can also hot-swap the existing hash table with a
blank one.

The allocation in ima_alloc_replace_htable() uses kcalloc() so the buckets
are zero-initialised (equivalent to HLIST_HEAD_INIT { .first = NULL }).
Callers of ima_alloc_replace_htable() must call synchronize_rcu() and free
the returned hash table.
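
The allocate-and-swap pattern can be sketched in user space as below. This
is a simplified model, not the kernel function: it uses calloc() and a
plain pointer store where the kernel would use kcalloc() and
rcu_assign_pointer(), reports failure through a hypothetical 'err'
out-parameter, and has no readers to wait for.

```c
#include <stdlib.h>

#define HTABLE_SIZE 1024

/* Stand-in for struct hlist_head. */
struct bucket {
	void *first;
};

static struct bucket *htable;	/* current table, NULL before init */

/*
 * Install a blank table and hand the previous one back to the caller.
 * On allocation failure the current table is kept and *err is set.
 */
static struct bucket *alloc_replace_htable(int *err)
{
	struct bucket *new_table, *old_table;

	/* Zeroed memory: every bucket starts out empty. */
	new_table = calloc(HTABLE_SIZE, sizeof(*new_table));
	if (!new_table) {
		*err = -1;
		return NULL;
	}

	*err = 0;
	old_table = htable;
	htable = new_table;
	return old_table;
}
```

The caller owns the returned table. In the kernel it would have to wait
for a grace period with synchronize_rcu() before freeing it; in this
single-threaded model a plain free() is enough.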

Finally, access the hash table with rcu_dereference() in
ima_lookup_digest_entry() (reader side) and with
rcu_dereference_protected() in ima_add_digest_entry() (writer side).

No functional change: bucket count, hash function, and all locking remain
identical.

Link: https://github.com/linux-integrity/linux/issues/1
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
---
Changelog:
v2:
 - Not present in this version

v1:
 - Not present in this version
---
 security/integrity/ima/ima.h       |  3 +-
 security/integrity/ima/ima_init.c  |  5 +++
 security/integrity/ima/ima_queue.c | 49 +++++++++++++++++++++++++++---
 3 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 1f2c81ec0fba..ccd037d49de7 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -285,6 +285,7 @@ bool ima_template_has_modsig(const struct ima_template_desc *ima_template);
 int ima_restore_measurement_entry(struct ima_template_entry *entry);
 int ima_restore_measurement_list(loff_t bufsize, void *buf);
 int ima_measurements_show(struct seq_file *m, void *v);
+int __init ima_init_htable(void);
 unsigned long ima_get_binary_runtime_size(void);
 int ima_init_template(void);
 void ima_init_template_list(void);
@@ -300,7 +301,7 @@ extern spinlock_t ima_queue_lock;
 
 extern atomic_long_t ima_num_entries;
 extern atomic_long_t ima_num_violations;
-extern struct hlist_head ima_htable[IMA_MEASURE_HTABLE_SIZE];
+extern struct hlist_head __rcu *ima_htable;
 
 static inline unsigned int ima_hash_key(u8 *digest)
 {
diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
index a2f34f2d8ad7..7e0aa09a12e6 100644
--- a/security/integrity/ima/ima_init.c
+++ b/security/integrity/ima/ima_init.c
@@ -140,6 +140,11 @@ int __init ima_init(void)
 	rc = ima_init_digests();
 	if (rc != 0)
 		return rc;
+
+	rc = ima_init_htable();
+	if (rc != 0)
+		return rc;
+
 	rc = ima_add_boot_aggregate();	/* boot aggregate must be first entry */
 	if (rc != 0)
 		return rc;
diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
index 4837fc6d9ada..2050b9d21e70 100644
--- a/security/integrity/ima/ima_queue.c
+++ b/security/integrity/ima/ima_queue.c
@@ -38,9 +38,7 @@ atomic_long_t ima_num_entries = ATOMIC_LONG_INIT(0);
 atomic_long_t ima_num_violations = ATOMIC_LONG_INIT(0);
 
 /* key: inode (before secure-hashing a file) */
-struct hlist_head ima_htable[IMA_MEASURE_HTABLE_SIZE] = {
-	[0 ... IMA_MEASURE_HTABLE_SIZE - 1] = HLIST_HEAD_INIT
-};
+struct hlist_head __rcu *ima_htable;
 
 /* mutex protects atomicity of extending measurement list
  * and extending the TPM PCR aggregate. Since tpm_extend can take
@@ -54,17 +52,54 @@ static DEFINE_MUTEX(ima_extend_list_mutex);
  */
 static bool ima_measurements_suspended;
 
+/* Callers must call synchronize_rcu() and free the hash table. */
+static struct hlist_head *ima_alloc_replace_htable(void)
+{
+	struct hlist_head *old_htable, *new_htable;
+
+	/* Zero-initialization is equivalent to HLIST_HEAD_INIT. */
+	new_htable = kcalloc(IMA_MEASURE_HTABLE_SIZE, sizeof(struct hlist_head),
+			     GFP_KERNEL);
+	if (!new_htable)
+		return ERR_PTR(-ENOMEM);
+
+	old_htable = rcu_replace_pointer(ima_htable, new_htable,
+				lockdep_is_held(&ima_extend_list_mutex));
+
+	return old_htable;
+}
+
+int __init ima_init_htable(void)
+{
+	struct hlist_head *old_htable;
+
+	mutex_lock(&ima_extend_list_mutex);
+	old_htable = ima_alloc_replace_htable();
+	mutex_unlock(&ima_extend_list_mutex);
+
+	/* synchronize_rcu() and kfree() not necessary, only for robustness. */
+	synchronize_rcu();
+
+	if (IS_ERR(old_htable))
+		return PTR_ERR(old_htable);
+
+	kfree(old_htable);
+	return 0;
+}
+
 /* lookup up the digest value in the hash table, and return the entry */
 static struct ima_queue_entry *ima_lookup_digest_entry(u8 *digest_value,
 						       int pcr)
 {
 	struct ima_queue_entry *qe, *ret = NULL;
+	struct hlist_head *htable;
 	unsigned int key;
 	int rc;
 
 	key = ima_hash_key(digest_value);
 	rcu_read_lock();
-	hlist_for_each_entry_rcu(qe, &ima_htable[key], hnext) {
+	htable = rcu_dereference(ima_htable);
+	hlist_for_each_entry_rcu(qe, &htable[key], hnext) {
 		rc = memcmp(qe->entry->digests[ima_hash_algo_idx].digest,
 			    digest_value, hash_digest_size[ima_hash_algo]);
 		if ((rc == 0) && (qe->entry->pcr == pcr)) {
@@ -104,6 +139,7 @@ static int ima_add_digest_entry(struct ima_template_entry *entry,
 				bool update_htable)
 {
 	struct ima_queue_entry *qe;
+	struct hlist_head *htable;
 	unsigned int key;
 
 	qe = kmalloc_obj(*qe);
@@ -116,10 +152,13 @@ static int ima_add_digest_entry(struct ima_template_entry *entry,
 	INIT_LIST_HEAD(&qe->later);
 	list_add_tail_rcu(&qe->later, &ima_measurements);
 
+	htable = rcu_dereference_protected(ima_htable,
+				lockdep_is_held(&ima_extend_list_mutex));
+
 	atomic_long_inc(&ima_num_entries);
 	if (update_htable) {
 		key = ima_hash_key(entry->digests[ima_hash_algo_idx].digest);
-		hlist_add_head_rcu(&qe->hnext, &ima_htable[key]);
+		hlist_add_head_rcu(&qe->hnext, &htable[key]);
 	}
 
 	if (binary_runtime_size != ULONG_MAX) {
-- 
2.43.0



* [PATCH v3 1/3] ima: Remove ima_h_table structure
From: Roberto Sassu @ 2026-03-11 17:19 UTC (permalink / raw)
  To: corbet, skhan, zohar, dmitry.kasatkin, eric.snowberg, paul,
	jmorris, serge
  Cc: linux-doc, linux-kernel, linux-integrity, linux-security-module,
	gregorylumen, chenste, nramas, Roberto Sassu

From: Roberto Sassu <roberto.sassu@huawei.com>

With the upcoming change to dynamically allocate and replace the hash
table, the counters for the number of measurement entries and violations
would have to be kept across each replacement.

Since those counters do not belong in the hash table structure anyway,
remove the ima_h_table structure and turn the counters and the hash table
into separate variables.

Link: https://github.com/linux-integrity/linux/issues/1
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
---
Changelog:
v2:
 - Not present in this version

v1:
 - Not present in this version
---
 security/integrity/ima/ima.h       |  9 +++------
 security/integrity/ima/ima_api.c   |  2 +-
 security/integrity/ima/ima_fs.c    | 19 +++++++++----------
 security/integrity/ima/ima_kexec.c |  2 +-
 security/integrity/ima/ima_queue.c | 17 ++++++++++-------
 5 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index c38a9eb945b6..1f2c81ec0fba 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -298,12 +298,9 @@ int ima_lsm_policy_change(struct notifier_block *nb, unsigned long event,
  */
 extern spinlock_t ima_queue_lock;
 
-struct ima_h_table {
-	atomic_long_t len;	/* number of stored measurements in the list */
-	atomic_long_t violations;
-	struct hlist_head queue[IMA_MEASURE_HTABLE_SIZE];
-};
-extern struct ima_h_table ima_htable;
+extern atomic_long_t ima_num_entries;
+extern atomic_long_t ima_num_violations;
+extern struct hlist_head ima_htable[IMA_MEASURE_HTABLE_SIZE];
 
 static inline unsigned int ima_hash_key(u8 *digest)
 {
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index 0916f24f005f..122d127e108d 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -146,7 +146,7 @@ void ima_add_violation(struct file *file, const unsigned char *filename,
 	int result;
 
 	/* can overflow, only indicator */
-	atomic_long_inc(&ima_htable.violations);
+	atomic_long_inc(&ima_num_violations);
 
 	result = ima_alloc_init_template(&event_data, &entry, NULL);
 	if (result < 0) {
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index ca4931a95098..aaa460d70ff7 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -38,8 +38,8 @@ __setup("ima_canonical_fmt", default_canonical_fmt_setup);
 
 static int valid_policy = 1;
 
-static ssize_t ima_show_htable_value(char __user *buf, size_t count,
-				     loff_t *ppos, atomic_long_t *val)
+static ssize_t ima_show_counter(char __user *buf, size_t count, loff_t *ppos,
+				atomic_long_t *val)
 {
 	char tmpbuf[32];	/* greater than largest 'long' string value */
 	ssize_t len;
@@ -48,15 +48,14 @@ static ssize_t ima_show_htable_value(char __user *buf, size_t count,
 	return simple_read_from_buffer(buf, count, ppos, tmpbuf, len);
 }
 
-static ssize_t ima_show_htable_violations(struct file *filp,
-					  char __user *buf,
-					  size_t count, loff_t *ppos)
+static ssize_t ima_show_num_violations(struct file *filp, char __user *buf,
+				       size_t count, loff_t *ppos)
 {
-	return ima_show_htable_value(buf, count, ppos, &ima_htable.violations);
+	return ima_show_counter(buf, count, ppos, &ima_num_violations);
 }
 
-static const struct file_operations ima_htable_violations_ops = {
-	.read = ima_show_htable_violations,
+static const struct file_operations ima_num_violations_ops = {
+	.read = ima_show_num_violations,
 	.llseek = generic_file_llseek,
 };
 
@@ -64,7 +63,7 @@ static ssize_t ima_show_measurements_count(struct file *filp,
 					   char __user *buf,
 					   size_t count, loff_t *ppos)
 {
-	return ima_show_htable_value(buf, count, ppos, &ima_htable.len);
+	return ima_show_counter(buf, count, ppos, &ima_num_entries);
 
 }
 
@@ -545,7 +544,7 @@ int __init ima_fs_init(void)
 	}
 
 	dentry = securityfs_create_file("violations", S_IRUSR | S_IRGRP,
-				   ima_dir, NULL, &ima_htable_violations_ops);
+				   ima_dir, NULL, &ima_num_violations_ops);
 	if (IS_ERR(dentry)) {
 		ret = PTR_ERR(dentry);
 		goto out;
diff --git a/security/integrity/ima/ima_kexec.c b/security/integrity/ima/ima_kexec.c
index 36a34c54de58..5801649fbbef 100644
--- a/security/integrity/ima/ima_kexec.c
+++ b/security/integrity/ima/ima_kexec.c
@@ -43,7 +43,7 @@ void ima_measure_kexec_event(const char *event_name)
 	int n;
 
 	buf_size = ima_get_binary_runtime_size();
-	len = atomic_long_read(&ima_htable.len);
+	len = atomic_long_read(&ima_num_entries);
 
 	n = scnprintf(ima_kexec_event, IMA_KEXEC_EVENT_LEN,
 		      "kexec_segment_size=%lu;ima_binary_runtime_size=%lu;"
diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
index 319522450854..4837fc6d9ada 100644
--- a/security/integrity/ima/ima_queue.c
+++ b/security/integrity/ima/ima_queue.c
@@ -32,11 +32,14 @@ static unsigned long binary_runtime_size;
 static unsigned long binary_runtime_size = ULONG_MAX;
 #endif
 
+/* num of stored meas. in the list */
+atomic_long_t ima_num_entries = ATOMIC_LONG_INIT(0);
+/* num of violations in the list */
+atomic_long_t ima_num_violations = ATOMIC_LONG_INIT(0);
+
 /* key: inode (before secure-hashing a file) */
-struct ima_h_table ima_htable = {
-	.len = ATOMIC_LONG_INIT(0),
-	.violations = ATOMIC_LONG_INIT(0),
-	.queue[0 ... IMA_MEASURE_HTABLE_SIZE - 1] = HLIST_HEAD_INIT
+struct hlist_head ima_htable[IMA_MEASURE_HTABLE_SIZE] = {
+	[0 ... IMA_MEASURE_HTABLE_SIZE - 1] = HLIST_HEAD_INIT
 };
 
 /* mutex protects atomicity of extending measurement list
@@ -61,7 +64,7 @@ static struct ima_queue_entry *ima_lookup_digest_entry(u8 *digest_value,
 
 	key = ima_hash_key(digest_value);
 	rcu_read_lock();
-	hlist_for_each_entry_rcu(qe, &ima_htable.queue[key], hnext) {
+	hlist_for_each_entry_rcu(qe, &ima_htable[key], hnext) {
 		rc = memcmp(qe->entry->digests[ima_hash_algo_idx].digest,
 			    digest_value, hash_digest_size[ima_hash_algo]);
 		if ((rc == 0) && (qe->entry->pcr == pcr)) {
@@ -113,10 +116,10 @@ static int ima_add_digest_entry(struct ima_template_entry *entry,
 	INIT_LIST_HEAD(&qe->later);
 	list_add_tail_rcu(&qe->later, &ima_measurements);
 
-	atomic_long_inc(&ima_htable.len);
+	atomic_long_inc(&ima_num_entries);
 	if (update_htable) {
 		key = ima_hash_key(entry->digests[ima_hash_algo_idx].digest);
-		hlist_add_head_rcu(&qe->hnext, &ima_htable.queue[key]);
+		hlist_add_head_rcu(&qe->hnext, &ima_htable[key]);
 	}
 
 	if (binary_runtime_size != ULONG_MAX) {
-- 
2.43.0



* Re: LSM namespacing API
From: Casey Schaufler @ 2026-03-11 16:37 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Dr. Greg, Paul Moore, Ondrej Mosnacek, linux-security-module,
	selinux, John Johansen, Casey Schaufler
In-Reply-To: <CAEjxPJ7yuJ6sAZ-ViqT04M5WPC9O39m5UUGw2f3+GDR87tvbsA@mail.gmail.com>

On 3/9/2026 11:15 AM, Stephen Smalley wrote:
> On Fri, Mar 6, 2026 at 4:01 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 3/6/2026 9:48 AM, Dr. Greg wrote:
>>> On Tue, Mar 03, 2026 at 11:46:53AM -0500, Paul Moore wrote:
>>>
>>> Good morning, I hope the week is winding down well for everyone.
>>>
> >>>> On Tue, Mar 3, 2026 at 8:30 AM Stephen Smalley
>>>>> I think my only caveat here is that your proposal is quite a bit more
>>>>> complex than what I implemented here:
>>>>> [1] https://lore.kernel.org/selinux/20251003190959.3288-2-stephen.smalley.work@gmail.com/
>>>>> [2] https://lore.kernel.org/selinux/20251003191328.3605-1-stephen.smalley.work@gmail.com/
>>>>> and I'm not sure the extra complexity is worth it.
>>>>>
>>>>> In particular:
>>>>> 1. Immediately unsharing the namespace upon lsm_set_self_attr() allows
>>>>> the caller to immediately and unambiguously know if the operation is
>>>>> supported and allowed ...
>>>> Performing the unshare operation immediately looks much less like a
>>>> LSM attribute and more like its own syscall.  That isn't a problem
>>>> in my eyes, it just means if this is the direction we want to go we
>>>> should implement a lsm_unshare(2) API, or something similar.
>>> Stephen's take on this is correct, the least complicated path forward
>>> is a simple call, presumably lsm_unshare(2), that instructs the LSM(s)
>>> to carry out whatever is needed to create a new security namespace.
>>>
>>> There are only two public implementations of what can be referred to
> >>> as major security namespacing efforts; Stephen's work with SELinux and
>>> our TSEM implementation.
>> Please be just a tiny bit careful before you make this sort of assertion:
>>
>>         https://lwn.net/Articles/645403/
> I believe both AppArmor and TOMOYO also have namespacing
> implementations already upstream, so SELinux is certainly not the only
> one. Looks like the Smack implementation you cited above was based on
> extending user namespaces rather than purely Smack-internal like the
> others; is that why it wasn't ultimately merged?

Less sophisticated solutions to the problem Smack namespaces were
intended to address became available. The effort stalled for lack of
a use case that required it. Much of what you want a namespace for
can be accomplished using process specific rules.




* Re: [PATCH v6] lsm: Add LSM hook security_unix_find
From: Paul Moore @ 2026-03-11 16:08 UTC (permalink / raw)
  To: Justin Suess
  Cc: Günther Noack, brauner, demiobenour, fahimitahera, hi, horms,
	ivanov.mikhail1, jannh, jmorris, john.johansen,
	konstantin.meskhidze, linux-security-module, m, matthieu, mic,
	netdev, samasth.norway.ananda, serge, viro
In-Reply-To: <abFhawSTjNoa-KaH@suesslenovo>

On Wed, Mar 11, 2026 at 8:34 AM Justin Suess <utilityemal77@gmail.com> wrote:
>
> On Tue, Mar 10, 2026 at 06:39:12PM -0400, Paul Moore wrote:
> > On Thu, Feb 19, 2026 at 3:26 PM Günther Noack <gnoack3000@gmail.com> wrote:
> > > On Thu, Feb 19, 2026 at 03:04:59PM -0500, Justin Suess wrote:
> > > > Add a LSM hook security_unix_find.
> > > >
> > > > This hook is called to check the path of a named unix socket before a
> > > > connection is initiated. The peer socket may be inspected as well.
> > > >
> > > > Why existing hooks are unsuitable:
> > > >
> > > > Existing socket hooks, security_unix_stream_connect(),
> > > > security_unix_may_send(), and security_socket_connect() don't provide
> > > > TOCTOU-free / namespace independent access to the paths of sockets.
> > > >
> > > > (1) We cannot resolve the path from the struct sockaddr in existing hooks.
> > > > This requires another path lookup. A change in the path between the
> > > > two lookups will cause a TOCTOU bug.
> > > >
> > > > (2) We cannot use the struct path from the listening socket, because it
> > > > may be bound to a path in a different namespace than the caller,
> > > > resulting in a path that cannot be referenced at policy creation time.
> > > >
> > > > Cc: Günther Noack <gnoack3000@gmail.com>
> > > > Cc: Tingmao Wang <m@maowtm.org>
> > > > Signed-off-by: Justin Suess <utilityemal77@gmail.com>
> > > > ---
> > > >  include/linux/lsm_hook_defs.h |  5 +++++
> > > >  include/linux/security.h      | 11 +++++++++++
> > > >  net/unix/af_unix.c            | 13 ++++++++++---
> > > >  security/security.c           | 20 ++++++++++++++++++++
> > > >  4 files changed, 46 insertions(+), 3 deletions(-)
> >
> > ...
> >
> > > Reviewed-by: Günther Noack <gnoack3000@gmail.com>
> > >
> > > Thank you, this looks good. I'll include it in the next version of the
> > > Unix connect patch set again.
> >
> > I'm looking for this patchset to review/ACK the new hook in context,
> > but I'm not seeing it in my inbox or lore.  Did I simply miss the
> > patchset or is it still a work in progress?  No worries if it hasn't
> > been posted yet, I just wanted to make sure I wasn't holding this up
> > any more than I already may have :)
>
> Good Morning Paul,
>
> Can't speak to the rest of the patch, but I sent this LSM hook for
> review purposes before inclusion with the rest of the V6 of this patch.
>
> Günther added his review tag, but I was asked to make some minor comment / commit
> message updates. I sent the same patch, with updated comments/commit to him
> in a follow up, off-list email to avoid spamming the list. No code changes were
> made, just comments.
>
> I don't think this particular patch will change substantially, unless we find
> something unexpected. But the way we use the hook may change (especially with
> regard to locking and the SOCK_DEAD state), which is important for your review.
>
> So you may want to hold off your review until the full V6 series gets sent so
> you can review the hook in context. There were some questions about
> locking that needed proper digging into. [1]

Great, thanks for the update, that was helpful.  As you recommend,
I'll hold off on reviewing this further until we have the full context
of the other patchset; we've already talked about this hook addition a
few times anyway, and based on a quick look yesterday, nothing
particularly evil jumped out at me.

-- 
paul-moore.com


* Re: [PATCH 0/3] Firmware LSM hook
From: Paul Moore @ 2026-03-11 16:06 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: James Morris, Serge E. Hallyn, Jason Gunthorpe, Saeed Mahameed,
	Itay Avraham, Dave Jiang, Jonathan Cameron, linux-security-module,
	linux-kernel, linux-rdma, Chiara Meiohas, Maher Sanalla,
	Edward Srouji
In-Reply-To: <20260311081955.GS12611@unreal>

On Wed, Mar 11, 2026 at 4:20 AM Leon Romanovsky <leon@kernel.org> wrote:
> On Tue, Mar 10, 2026 at 05:40:02PM -0400, Paul Moore wrote:
> > On Tue, Mar 10, 2026 at 3:30 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > On Tue, Mar 10, 2026 at 02:24:40PM -0400, Paul Moore wrote:
> > > > On Tue, Mar 10, 2026 at 5:07 AM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > On Mon, Mar 09, 2026 at 07:10:25PM -0400, Paul Moore wrote:
> > > > > > On Mon, Mar 9, 2026 at 3:37 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > > > On Mon, Mar 09, 2026 at 02:32:39PM -0400, Paul Moore wrote:
> > > > > > > > On Mon, Mar 9, 2026 at 7:15 AM Leon Romanovsky <leon@kernel.org> wrote:
> > > >
> > > > ...
> > > >
> > > > > > > > Hi Leon,
> > > > > > > >
> > > > > > > > At the link below, you'll find guidance on submitting new LSM hooks.
> > > > > > > > Please take a look and let me know if you have any questions.
> > > > > > > >
> > > > > > > > https://github.com/LinuxSecurityModule/kernel/blob/main/README.md#new-lsm-hooks
> > > > > > >
> > > > > > > I assume that you are referring to this part:
> > > > > >
> > > > > > I'm referring to all of the guidance, but yes, at the very least that
> > > > > > is something that I think we need to see in a future revision of this
> > > > > > patchset.
> > > > > >
> > > > > > >  * New LSM hooks must demonstrate their usefulness by providing a meaningful
> > > > > > >    implementation for at least one in-kernel LSM. The goal is to demonstrate
> > > > > > >    the purpose and expected semantics of the hooks. Out of tree kernel code,
> > > > > > >    and pass through implementations, such as the BPF LSM, are not eligible
> > > > > > >    for LSM hook reference implementations.
> > > > > > >
> > > > > > > The point is that we are not inspecting a kernel call, but the FW mailbox,
> > > > > > > which has very little meaning to the kernel. From the kernel's perspective,
> > > > > > > all relevant checks have already been performed, but the existing capability
> > > > > > > granularity does not allow us to distinguish between FW_CMD1 and FW_CMD2.
> > > > > >
> > > > > > It might help if you could phrase this differently, as I'm not
> > > > > > entirely clear on your argument.  LSMs are not limited to enforcing
> > > > > > access controls on requests the kernel understands (see the SELinux
> > > > > > userspace object manager concept), and the idea of access controls
> > > > > > with greater granularity than capabilities is one of the main reasons
> > > > > > people look to LSMs for access control (SELinux, AppArmor, Smack,
> > > > > > etc.).
> > > > >
> > > > > I should note that my understanding of LSM is limited, so some parts of my
> > > > > answers may be inaccurate.
> > > > >
> > > > > What I am referring to is a different level of granularity — specifically,
> > > > > the internals of the firmware commands. In the proposed approach, BPF
> > > > > programs would make decisions based on data passed through the mailbox.
> > > > > That mailbox format varies across vendors, and may even differ between
> > > > > firmware versions from the same vendor.
> > > >
> > > > That helps, thank you.
> > > >
> > > > > > > Here we propose a generic interface that can be applied to all FWCTL
> > > > > > > devices without out-of-tree kernel code at all.
> > > > > >
> > > > > > I expected to see a patch implementing some meaningful support for
> > > > > > access controls using these hooks in one of the existing LSMs, I did
> > > > > > not see that in this patchset.
> > > > >
> > > > > In some cases, the mailbox is forwarded from user space unchanged, but
> > > > > in others the kernel modifies it before submitting it to the FW.
> > > >
> > > > Without a standard format, opcode definitions, etc. I suspect
> > > > integrating this into an LSM will present a number of challenges.
> > >
> > > The opcode is relatively easy to extract from the mailbox and pass to the LSM.
> > > All drivers implement some variant of mlx5ctl_validate_rpc()/devx_is_general_cmd()
> > > to validate the opcode. The problem is that this check alone is not sufficient.
> > >
> > > > Instead of performing an LSM access control check before submitting
> > > > the firmware command, it might be easier from an LSM perspective to
> > > > have the firmware call into the kernel/LSM for an access control
> > > > decision before performing a security-relevant action.
> > >
> > > Ultimately, the LSM must make a decision for each executed firmware
> > > command. This will need to be handled one way or another, and will
> > > likely require parsing the mailbox again.
> >
> > As it's unlikely that parsing the mailbox is something that a LSM will
> > want to handle,
>
> I believe this approach offers the cleanest and most natural way to support
> all mailbox‑based devices.
>
> > my suggestion was to leverage the existing mailbox parsing in the firmware
> > and require the firmware to call into the LSM when authorization is needed.
> >
> > > > This removes the challenge of parsing/interpreting the arbitrary firmware commands,
> > > > but it does add some additional complexity of having to generically
> > > > represent the security relevant actions the firmware might request
> > >
> > > The difference here is that the proposed LSM hook is intended to disable
> > > certain functionality provided by the firmware, effectively depending on
> > > the operator’s preferences.
> >
> > My suggestion would also allow a LSM hook to disable certain firmware
> > functionality; however, the firmware itself would need to call the LSM
> > to check if the functionality is authorized.
>
> This suggestion adds an extra call from the FW to the LSM for every command, even
> for systems which don't have LSM at all.

If latency is a concern, I imagine we could create an LSM hook to
report whether any LSMs provided firmware access controls.  The
firmware could then use that hook, potentially caching the result, to
limit its calls into the LSM.

> The FW must pass the already parsed data
> back to the LSM; otherwise, the LSM has no basis to decide whether to accept or
> reject the request.
>
> For example, consider the MLX5_CMD_OP_QUERY_DCT command handled in
> mlx5ctl_validate_rpc(). DCT in RDMA refers to Dynamically Connected
> Transport, a Mellanox-specific extension that effectively introduces a new
> QP‑type family on top of the standard RC/UC/UD transports. This type does not
> exist for other vendors, each of whom provides its own vendor‑specific
> extensions. All parameters here are tightly coupled to those specific
> commands.
>
> It is unrealistic to expect different firmware implementations to supply
> their data in a common format that would allow the LSM to make a generic
> decision.

That's unfortunate as that would be the easiest path forward.
Regardless, you are welcome to work on whatever implementation you
think makes sense for any of the in-tree LSMs, with that in place we
can take another look at the firmware command hooks.

Good luck.

-- 
paul-moore.com


* Re: [PATCH 01/61] Coccinelle: Prefer IS_ERR_OR_NULL over manual NULL check
From: Markus Elfring @ 2026-03-11 15:12 UTC (permalink / raw)
  To: Philipp Hahn, cocci, Julia Lawall, Nicolas Palix
  Cc: amd-gfx, apparmor, bpf, ceph-devel, dm-devel, dri-devel, gfs2,
	intel-gfx, intel-wired-lan, iommu, kvm, linux-arm-kernel,
	linux-block, linux-bluetooth, linux-btrfs, linux-cifs, linux-clk,
	linux-erofs, linux-ext4, linux-fsdevel, linux-gpio, linux-hyperv,
	linux-input, linux-leds, linux-media, linux-mips, linux-mm,
	linux-modules, linux-mtd, linux-nfs, linux-omap, linux-phy,
	linux-pm, linux-rockchip, linux-s390, linux-scsi, linux-sctp,
	linux-security-module, linux-sh, linux-sound, linux-stm32,
	linux-trace-kernel, linux-usb, linux-wireless, netdev, ntfs3,
	samba-technical, sched-ext, target-devel, tipc-discussion, v9fs,
	LKML
In-Reply-To: <20260310-b4-is_err_or_null-v1-1-bd63b656022d@avm.de>

…
> +// Confidence: High

Some contributors offered discerning comments on this change approach.
I am therefore curious how well those comments can be taken into account
by means of the semantic patch language (Coccinelle software).

…
> +@p1 depends on patch@
> +expression E;
> +@@
> +(
> +-	E != NULL && !IS_ERR(E)
> ++	!IS_ERR_OR_NULL(E)
> +|
> +-	E == NULL || IS_ERR(E)
> ++	IS_ERR_OR_NULL(E)
> +|
> +-	!IS_ERR(E) && E != NULL
> ++	!IS_ERR_OR_NULL(E)
> +|
> +-	IS_ERR(E) || E == NULL
> ++	IS_ERR_OR_NULL(E)
> +)
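
The four rewrites in the rule above rely on IS_ERR_OR_NULL(E) being the same predicate as E == NULL || IS_ERR(E). A minimal userspace sketch of that equivalence follows; the helpers are simplified re-implementations modeled on include/linux/err.h (not the kernel's exact code), and rule_p1_holds() is a hypothetical name used only here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in the top page of the address space. */
static inline void *ERR_PTR(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True if ptr lies in the range reserved for encoded errno values. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/*
 * Check, for one value of E, that each "-" line of the rule evaluates
 * to the same truth value as its "+" replacement.
 */
static bool rule_p1_holds(const void *E)
{
	return ((E != NULL && !IS_ERR(E)) == !IS_ERR_OR_NULL(E)) &&
	       ((E == NULL || IS_ERR(E)) == IS_ERR_OR_NULL(E)) &&
	       ((!IS_ERR(E) && E != NULL) == !IS_ERR_OR_NULL(E)) &&
	       ((IS_ERR(E) || E == NULL) == IS_ERR_OR_NULL(E));
}
```

The sketch only covers side-effect-free expressions. When E is a function call, the manual form evaluates it twice and the rewritten form once, which is a separate question from the logical equivalence checked here.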

Several detected expressions should refer to return values from function calls.
https://en.wikipedia.org/wiki/Return_statement

* Do any development challenges still hinder the determination of the
  corresponding failure predicates?

* How will interests evolve to improve data processing any further for such
  use cases?

Regards,
Markus


* Re: [PATCH 15/61] trace: Prefer IS_ERR_OR_NULL over manual NULL check
From: Geert Uytterhoeven @ 2026-03-11 14:06 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu (Google), Philipp Hahn, amd-gfx, apparmor, bpf,
	ceph-devel, cocci, dm-devel, dri-devel, gfs2, intel-gfx,
	intel-wired-lan, iommu, kvm, linux-arm-kernel, linux-block,
	linux-bluetooth, linux-btrfs, linux-cifs, linux-clk, linux-erofs,
	linux-ext4, linux-fsdevel, linux-gpio, linux-hyperv, linux-input,
	linux-kernel, linux-leds, linux-media, linux-mips, linux-mm,
	linux-modules, linux-mtd, linux-nfs, linux-omap, linux-phy,
	linux-pm, linux-rockchip, linux-s390, linux-scsi, linux-sctp,
	linux-security-module, linux-sh, linux-sound, linux-stm32,
	linux-trace-kernel, linux-usb, linux-wireless, netdev, ntfs3,
	samba-technical, sched-ext, target-devel, tipc-discussion, v9fs,
	Mathieu Desnoyers
In-Reply-To: <20260311100332.6a2ce4b1@gandalf.local.home>

Hi Steven,

On Wed, 11 Mar 2026 at 15:03, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Wed, 11 Mar 2026 14:13:32 +0900
> Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
>
> > Hmm, now IS_ERR_OR_NULL() is an inline function, so it is safe.
> > But if you want to use IS_ERR_OR_NULL() here, something like this would be better:
> >
> > node = rhashtable_walk_next(&iter);
> > while (!IS_ERR_OR_NULL(node)) {
> >       fprobe_remove_node_in_module(mod, node, &alist);
> >       node = rhashtable_walk_next(&iter);
> > }
>
> But now you need duplicate code in order to acquire "node"
>
> I think the patch just makes the code worse.

Obviously we need a new for_each_*() helper hiding all the gory internals?
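
A minimal userspace sketch of what such a helper might look like. Everything here is hypothetical: struct walk_iter and walk_next() are toy stand-ins for the rhashtable walker, for_each_walk_node() is an invented name, and the toy iterator only ever returns NULL at the end, so it does not model error pointers that a real walker may return mid-walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace version of the kernel predicate. */
static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Toy iterator standing in for struct rhashtable_iter. */
struct walk_iter {
	int *items;
	size_t count;
	size_t pos;
};

/* Toy counterpart of rhashtable_walk_next(): NULL ends the walk. */
static void *walk_next(struct walk_iter *iter)
{
	assert(iter != NULL);
	if (iter->pos >= iter->count)
		return NULL;
	return &iter->items[iter->pos++];
}

/*
 * Hypothetical helper: the fetch of "node" appears twice in the macro,
 * but callers write it only once.
 */
#define for_each_walk_node(node, iter)			\
	for ((node) = walk_next(iter);			\
	     !IS_ERR_OR_NULL(node);			\
	     (node) = walk_next(iter))

/* Example caller: add up every element the walker yields. */
static int sum_walk(struct walk_iter *iter)
{
	int *node;
	int sum = 0;

	for_each_walk_node(node, iter)
		sum += *node;
	return sum;
}
```

The duplication Steven objects to does not disappear; it moves into the macro, which is the usual trade-off with for_each_*() style helpers.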

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH 15/61] trace: Prefer IS_ERR_OR_NULL over manual NULL check
From: Steven Rostedt @ 2026-03-11 14:03 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Philipp Hahn, amd-gfx, apparmor, bpf, ceph-devel, cocci, dm-devel,
	dri-devel, gfs2, intel-gfx, intel-wired-lan, iommu, kvm,
	linux-arm-kernel, linux-block, linux-bluetooth, linux-btrfs,
	linux-cifs, linux-clk, linux-erofs, linux-ext4, linux-fsdevel,
	linux-gpio, linux-hyperv, linux-input, linux-kernel, linux-leds,
	linux-media, linux-mips, linux-mm, linux-modules, linux-mtd,
	linux-nfs, linux-omap, linux-phy, linux-pm, linux-rockchip,
	linux-s390, linux-scsi, linux-sctp, linux-security-module,
	linux-sh, linux-sound, linux-stm32, linux-trace-kernel, linux-usb,
	linux-wireless, netdev, ntfs3, samba-technical, sched-ext,
	target-devel, tipc-discussion, v9fs, Mathieu Desnoyers
In-Reply-To: <20260311141332.b611237d36b61b2409e66cb3@kernel.org>

On Wed, 11 Mar 2026 14:13:32 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> Hmm, now IS_ERR_OR_NULL() is an inline function, so it is safe.
> But if you want to use IS_ERR_OR_NULL() here, something like this would be better:
> 
> node = rhashtable_walk_next(&iter);
> while (!IS_ERR_OR_NULL(node)) {
> 	fprobe_remove_node_in_module(mod, node, &alist);
> 	node = rhashtable_walk_next(&iter);
> }

But now you need duplicate code in order to acquire "node"

I think the patch just makes the code worse.

-- Steve


* Re: [PATCH 36/61] arch/sh: Prefer IS_ERR_OR_NULL over manual NULL check
From: Geert Uytterhoeven @ 2026-03-11 13:15 UTC (permalink / raw)
  To: Philipp Hahn
  Cc: amd-gfx, apparmor, bpf, ceph-devel, cocci, dm-devel, dri-devel,
	gfs2, intel-gfx, intel-wired-lan, iommu, kvm, linux-arm-kernel,
	linux-block, linux-bluetooth, linux-btrfs, linux-cifs, linux-clk,
	linux-erofs, linux-ext4, linux-fsdevel, linux-gpio, linux-hyperv,
	linux-input, linux-kernel, linux-leds, linux-media, linux-mips,
	linux-mm, linux-modules, linux-mtd, linux-nfs, linux-omap,
	linux-phy, linux-pm, linux-rockchip, linux-s390, linux-scsi,
	linux-sctp, linux-security-module, linux-sh, linux-sound,
	linux-stm32, linux-trace-kernel, linux-usb, linux-wireless,
	netdev, ntfs3, samba-technical, sched-ext, target-devel,
	tipc-discussion, v9fs, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz
In-Reply-To: <20260310-b4-is_err_or_null-v1-36-bd63b656022d@avm.de>

On Tue, 10 Mar 2026 at 12:56, Philipp Hahn <phahn-oss@avm.de> wrote:
> Prefer using IS_ERR_OR_NULL() over using IS_ERR() and a manual NULL
> check.
>
> Change generated with coccinelle.
>
> To: Yoshinori Sato <ysato@users.sourceforge.jp>
> To: Rich Felker <dalias@libc.org>
> To: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
> Cc: linux-sh@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Philipp Hahn <phahn-oss@avm.de>

Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH v4 15/17] module: Introduce hash-based integrity checking
From: Thomas Weißschuh @ 2026-03-11 13:19 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Nathan Chancellor, Arnd Bergmann, Luis Chamberlain, Petr Pavlu,
	Sami Tolvanen, Daniel Gomez, Paul Moore, James Morris,
	Serge E. Hallyn, Jonathan Corbet, Madhavan Srinivasan,
	Michael Ellerman, Nicholas Piggin, Naveen N Rao, Mimi Zohar,
	Roberto Sassu, Dmitry Kasatkin, Eric Snowberg, Nicolas Schier,
	Daniel Gomez, Aaron Tomlin, Christophe Leroy (CS GROUP),
	Nicolas Schier, Nicolas Bouchinet, Xiu Jianfeng,
	Fabian Grünbichler, Arnout Engelen, Mattia Rizzolo, kpcyrd,
	Christian Heusel, Câju Mihai-Drosi,
	Sebastian Andrzej Siewior, linux-kbuild, linux-kernel, linux-arch,
	linux-modules, linux-security-module, linux-doc, linuxppc-dev,
	linux-integrity
In-Reply-To: <20260311011218.GA212983@quark>

On 2026-03-10 18:12:18-0700, Eric Biggers wrote:
> On Tue, Jan 13, 2026 at 01:28:59PM +0100, Thomas Weißschuh wrote:
> > The current signature-based module integrity checking has some drawbacks
> > in combination with reproducible builds. Either the module signing key
> > is generated at build time, which makes the build unreproducible, or a
> > static signing key is used, which precludes rebuilds by third parties
> > and makes the whole build and packaging process much more complicated.
> 
> I think this actually undersells the feature.

(...)

> So I think this is how module authentication should have been done
> originally, and I'm glad to see this is finally being fixed.

Thanks, that is nice to hear.

> > +struct module_hashes_proof {
> > +	__be32 pos;
> > +	u8 hash_sigs[][MODULE_HASHES_HASH_SIZE];
> > +} __packed;
> 
> Is the choice of big endian for consistency with struct
> module_signature?  Little endian is the usual choice in new code.

Yes, it's for consistency. But I am fine with either way. Given that
this is essentially an internal ABI, we could always change it later.

> > diff --git a/include/linux/module_signature.h b/include/linux/module_signature.h
> > index a45ce3b24403..3b510651830d 100644
> > --- a/include/linux/module_signature.h
> > +++ b/include/linux/module_signature.h
> > @@ -18,6 +18,7 @@ enum pkey_id_type {
> >  	PKEY_ID_PGP,		/* OpenPGP generated key ID */
> >  	PKEY_ID_X509,		/* X.509 arbitrary subjectKeyIdentifier */
> >  	PKEY_ID_PKCS7,		/* Signature in PKCS#7 message */
> > +	PKEY_ID_MERKLE,		/* Merkle proof for modules */
> 
> I recommend making the hash algorithm explicit:
> 
>         PKEY_ID_MERKLE_SHA256,	/* SHA-256 merkle proof for modules */
> 
> While I wouldn't encourage the addition of another hash algorithm
> (specifying one good algorithm for now is absolutely the right choice),
> if someone ever does need to add another one, we'd want them to be
> guided to simply introduce a new value of this enum rather than hack it
> in some other way.

The idea here was that this will only ever be used for modules built as
part of the kernel build. So the actual implementation could change freely
without affecting anything.

But I don't have hard feelings about it.

> > +static void hash_entry(const void *left, const void *right, void *out)
> 
> Byte arrays should use u8 instead of void

Ack.

> > diff --git a/scripts/modules-merkle-tree.c b/scripts/modules-merkle-tree.c
> [...]
> 
> > +struct file_entry {
> > +	char *name;
> > +	unsigned int pos;
> > +	unsigned char hash[EVP_MAX_MD_SIZE];
> 
> Considering that the hash algorithm is fixed, EVP_MAX_MD_SIZE can be
> replaced with a tighter local definition:

Ack.

>     #define MAX_HASH_SIZE 32

IMO it shouldn't even mention 'MAX', as there is only one hash
algorithm.

(...)

> > +{
> > +	fprintf(stderr,
> > +		"Usage: scripts/modules-merkle-tree <root definition>\n");
> > +	exit(2);
> 
> This should show both parameters, <root hash> <new suffix>

Ack.

> But they probably should be flipped to put the output second.

Ack.

> Though, is <new suffix> needed at all?  It looks like it doesn't
> actually affect the output.

It will be required for compatibility with INSTALL_MOD_STRIP,
two patches later. I'll move this code into the later patch.

> > +	hash_evp = EVP_get_digestbyname("sha256");
> 
> EVP_sha256()

(...)

Ack to all other remarks.


Thomas
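
As an illustration of the proof layout quoted above (a big-endian leaf
position followed by an array of sibling hashes), here is a user-space sketch
of how such a proof could be checked. It shows the general Merkle-proof
technique under stated assumptions and is not the patch's implementation:
toy_hash() is a non-cryptographic placeholder for SHA-256, HASH_SIZE,
load_be32() and verify_proof() are hypothetical names, and the
one-bit-per-level use of pos is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_SIZE 32	/* hypothetical name; SHA-256 digest length */

/* Decode a big-endian 32-bit value regardless of host byte order. */
static uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Placeholder for hashing two child nodes into their parent.
 * NOT cryptographic: FNV-1a stretched to HASH_SIZE bytes, for demonstration.
 */
static void toy_hash(const uint8_t *left, const uint8_t *right, uint8_t *out)
{
	uint64_t h = 0xcbf29ce484222325ULL;
	size_t i;

	for (i = 0; i < HASH_SIZE; i++)
		h = (h ^ left[i]) * 0x100000001b3ULL;
	for (i = 0; i < HASH_SIZE; i++)
		h = (h ^ right[i]) * 0x100000001b3ULL;
	for (i = 0; i < HASH_SIZE; i++) {
		h = (h ^ i) * 0x100000001b3ULL;
		out[i] = (uint8_t)(h >> 56);
	}
}

/*
 * Walk from the leaf to the root. At each level the low bit of pos says
 * whether the current node is a right child (1) or a left child (0).
 * siblings holds depth hashes of HASH_SIZE bytes each, leaf level first.
 * Returns 1 if the recomputed root matches, 0 otherwise.
 */
static int verify_proof(const uint8_t *pos_be, const uint8_t *siblings,
			size_t depth, const uint8_t *leaf,
			const uint8_t *root)
{
	uint32_t pos = load_be32(pos_be);
	uint8_t cur[HASH_SIZE], next[HASH_SIZE];
	size_t i;

	memcpy(cur, leaf, HASH_SIZE);
	for (i = 0; i < depth; i++, pos >>= 1) {
		const uint8_t *sib = siblings + i * HASH_SIZE;

		if (pos & 1)
			toy_hash(sib, cur, next);
		else
			toy_hash(cur, sib, next);
		memcpy(cur, next, HASH_SIZE);
	}
	return memcmp(cur, root, HASH_SIZE) == 0;
}
```

Decoding the position byte by byte keeps the sketch independent of host
endianness, so the same proof bytes verify on either kind of machine. Only the
root has to be trusted; the siblings travel with the module.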

^ permalink raw reply

* Re: [PATCH v4 09/17] module: Make module loading policy usable without MODULE_SIG
From: Thomas Weißschuh @ 2026-03-11 12:59 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Nathan Chancellor, Arnd Bergmann, Luis Chamberlain, Petr Pavlu,
	Sami Tolvanen, Daniel Gomez, Paul Moore, James Morris,
	Serge E. Hallyn, Jonathan Corbet, Madhavan Srinivasan,
	Michael Ellerman, Nicholas Piggin, Naveen N Rao, Mimi Zohar,
	Roberto Sassu, Dmitry Kasatkin, Eric Snowberg, Nicolas Schier,
	Daniel Gomez, Aaron Tomlin, Christophe Leroy (CS GROUP),
	Nicolas Schier, Nicolas Bouchinet, Xiu Jianfeng,
	Fabian Grünbichler, Arnout Engelen, Mattia Rizzolo, kpcyrd,
	Christian Heusel, Câju Mihai-Drosi,
	Sebastian Andrzej Siewior, linux-kbuild, linux-kernel, linux-arch,
	linux-modules, linux-security-module, linux-doc, linuxppc-dev,
	linux-integrity
In-Reply-To: <20260310220146.GE120274@quark>

On 2026-03-10 15:01:46-0700, Eric Biggers wrote:
> On Tue, Jan 13, 2026 at 01:28:53PM +0100, Thomas Weißschuh wrote:
> > The loading policy functionality will also be used by the hash-based
> > module validation. Split it out from CONFIG_MODULE_SIG so it is usable
> > by both.
> > 
> > Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
> > ---
> >  include/linux/module.h  |  8 ++++----
> >  kernel/module/Kconfig   |  5 ++++-
> >  kernel/module/main.c    | 26 +++++++++++++++++++++++++-
> >  kernel/module/signing.c | 21 ---------------------
> >  4 files changed, 33 insertions(+), 27 deletions(-)
> > 
> > diff --git a/include/linux/module.h b/include/linux/module.h
> > index f288ca5cd95b..f9601cba47cd 100644
> > --- a/include/linux/module.h
> > +++ b/include/linux/module.h
> > @@ -444,7 +444,7 @@ struct module {
> >  	const u32 *gpl_crcs;
> >  	bool using_gplonly_symbols;
> >  
> > -#ifdef CONFIG_MODULE_SIG
> > +#ifdef CONFIG_MODULE_SIG_POLICY
> >  	/* Signature was verified. */
> >  	bool sig_ok;
> >  #endif
> [...]
> > +config MODULE_SIG_POLICY
> > +	def_bool MODULE_SIG
> 
> Maybe MODULE_AUTH_POLICY?  Hash-based module authentication does not use
> signatures.
> 
> This issue appears elsewhere in the code too.  There are lots of places
> that still refer to module signatures or "sigs", when really module
> authentication is meant.
> 
> I'm not sure how far you want to go with the renaming, but it's
> something to think about.  It's confusing to use the term "signature" to
> mean something that is not a signature.

Ack. "authentication" is much better, I'll use that.


Thomas

^ permalink raw reply

* Re: [PATCH v4 06/17] kbuild: add stamp file for vmlinux BTF data
From: Thomas Weißschuh @ 2026-03-11 12:58 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Nathan Chancellor, Arnd Bergmann, Luis Chamberlain, Petr Pavlu,
	Sami Tolvanen, Daniel Gomez, Paul Moore, James Morris,
	Serge E. Hallyn, Jonathan Corbet, Madhavan Srinivasan,
	Michael Ellerman, Nicholas Piggin, Naveen N Rao, Mimi Zohar,
	Roberto Sassu, Dmitry Kasatkin, Eric Snowberg, Nicolas Schier,
	Daniel Gomez, Aaron Tomlin, Christophe Leroy (CS GROUP),
	Nicolas Schier, Nicolas Bouchinet, Xiu Jianfeng,
	Fabian Grünbichler, Arnout Engelen, Mattia Rizzolo, kpcyrd,
	Christian Heusel, Câju Mihai-Drosi,
	Sebastian Andrzej Siewior, linux-kbuild, linux-kernel, linux-arch,
	linux-modules, linux-security-module, linux-doc, linuxppc-dev,
	linux-integrity
In-Reply-To: <20260310213606.GD120274@quark>

On 2026-03-10 14:36:06-0700, Eric Biggers wrote:
> On Tue, Jan 13, 2026 at 01:28:50PM +0100, Thomas Weißschuh wrote:
> > The upcoming module hashes functionality will build the modules in
> > between the generation of the BTF data and the final link of vmlinux.
> > Having a dependency from the modules on vmlinux would make this
> > impossible as it would mean having a cyclic dependency.
> > Break this cyclic dependency by introducing a new target.
> > 
> > Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
> > ---
> >  scripts/Makefile.modfinal | 4 ++--
> >  scripts/link-vmlinux.sh   | 6 ++++++
> >  2 files changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
> > index 149e12ff5700..adfef1e002a9 100644
> > --- a/scripts/Makefile.modfinal
> > +++ b/scripts/Makefile.modfinal
> > @@ -56,8 +56,8 @@ if_changed_except = $(if $(call newer_prereqs_except,$(2))$(cmd-check),      \
> >  	printf '%s\n' 'savedcmd_$@ := $(make-cmd)' > $(dot-target).cmd, @:)
> >  
> >  # Re-generate module BTFs if either module's .ko or vmlinux changed
> > -%.ko: %.o %.mod.o .module-common.o $(objtree)/scripts/module.lds $(and $(CONFIG_DEBUG_INFO_BTF_MODULES),$(KBUILD_BUILTIN),$(objtree)/vmlinux) FORCE
> > -	+$(call if_changed_except,ld_ko_o,$(objtree)/vmlinux)
> > +%.ko: %.o %.mod.o .module-common.o $(objtree)/scripts/module.lds $(and $(CONFIG_DEBUG_INFO_BTF_MODULES),$(KBUILD_BUILTIN),$(objtree)/.tmp_vmlinux_btf.stamp) FORCE
> > +	+$(call if_changed_except,ld_ko_o,$(objtree)/.tmp_vmlinux_btf.stamp)
> >  ifdef CONFIG_DEBUG_INFO_BTF_MODULES
> >  	+$(if $(newer-prereqs),$(call cmd,btf_ko))
> >  endif
> > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> > index 4ab44c73da4d..8c98f8645a5c 100755
> > --- a/scripts/link-vmlinux.sh
> > +++ b/scripts/link-vmlinux.sh
> > @@ -111,6 +111,7 @@ vmlinux_link()
> >  gen_btf()
> >  {
> >  	local btf_data=${1}.btf.o
> > +	local btf_stamp=.tmp_vmlinux_btf.stamp
> >  
> >  	info BTF "${btf_data}"
> >  	LLVM_OBJCOPY="${OBJCOPY}" ${PAHOLE} -J ${PAHOLE_FLAGS} ${1}
> > @@ -131,6 +132,11 @@ gen_btf()
> >  	fi
> >  	printf "${et_rel}" | dd of="${btf_data}" conv=notrunc bs=1 seek=16 status=none
> >  
> > +	info STAMP $btf_stamp
> > +	if ! cmp --silent $btf_data $btf_stamp; then
> > +		cp $btf_data $btf_stamp
> > +	fi

This patch will be gone from the next revision of the series.
Making use of the recently introduced vmlinux.unstripped,
as suggested by Petr, removes any modifications to link-vmlinux.sh
and the issue that this patch tried to address.

> A "stamp file" is traditionally an empty file that is written when some
> build step has completed.  The above code is instead copying the entire
> .tmp_vmlinux1.btf.o file (megabytes in size) to .tmp_vmlinux_btf.stamp.

The goal here was to have not only a reference timestamp, but
specifically the reference file contents.

Note: The duplicated vmlinux.unstripped in its current form is far
larger than .tmp_vmlinux1.btf.o.

> So, it's not clear to me why the stamp file is needed at all, versus
> depending directly on .tmp_vmlinux1.btf.o.
>
> I guess 'make' doesn't know about the dependencies of
> .tmp_vmlinux1.btf.o.  But the same is true of the stamp file, right?  So
> either way, how would 'make' know to finish rebuilding the file before
> starting to execute the "Re-generate module BTFs" rule?

The problem was not the ordering, this is handled within link-vmlinux.sh.
IIRC originally without this patch even no-op rebuilds would end up
rebuilding the modules. Using .tmp_vmlinux1.btf.o may have worked too.
But in v4, the patch "kbuild: generate module BTF based on
vmlinux.unstripped" was added, which also solves this problem.

> Also, passing the long option '--silent' to 'cmp' creates a dependency
> on the GNU implementation of 'cmp', which isn't documented as a kernel
> build dependency.  Probably better to use the short option '-s'.

Ack.

> Also, the stamp file isn't being deleted by 'make clean'.  It looks like
> it would need to be added to cleanup() in link-vmlinux.sh.

Ack.


Thomas

^ permalink raw reply

* Re: [PATCH v6] lsm: Add LSM hook security_unix_find
From: Justin Suess @ 2026-03-11 12:34 UTC (permalink / raw)
  To: Paul Moore
  Cc: Günther Noack, brauner, demiobenour, fahimitahera, hi, horms,
	ivanov.mikhail1, jannh, jmorris, john.johansen,
	konstantin.meskhidze, linux-security-module, m, matthieu, mic,
	netdev, samasth.norway.ananda, serge, viro
In-Reply-To: <CAHC9VhSu=YtE0dMiEtUQgz0pFy5K8cuHh_EWprNtFTDAxhss_w@mail.gmail.com>

On Tue, Mar 10, 2026 at 06:39:12PM -0400, Paul Moore wrote:
> On Thu, Feb 19, 2026 at 3:26 PM Günther Noack <gnoack3000@gmail.com> wrote:
> > On Thu, Feb 19, 2026 at 03:04:59PM -0500, Justin Suess wrote:
> > > Add a LSM hook security_unix_find.
> > >
> > > This hook is called to check the path of a named unix socket before a
> > > connection is initiated. The peer socket may be inspected as well.
> > >
> > > Why existing hooks are unsuitable:
> > >
> > > Existing socket hooks, security_unix_stream_connect(),
> > > security_unix_may_send(), and security_socket_connect() don't provide
> > > TOCTOU-free / namespace independent access to the paths of sockets.
> > >
> > > (1) We cannot resolve the path from the struct sockaddr in existing hooks.
> > > This requires another path lookup. A change in the path between the
> > > two lookups will cause a TOCTOU bug.
> > >
> > > (2) We cannot use the struct path from the listening socket, because it
> > > may be bound to a path in a different namespace than the caller,
> > > resulting in a path that cannot be referenced at policy creation time.
> > >
> > > Cc: Günther Noack <gnoack3000@gmail.com>
> > > Cc: Tingmao Wang <m@maowtm.org>
> > > Signed-off-by: Justin Suess <utilityemal77@gmail.com>
> > > ---
> > >  include/linux/lsm_hook_defs.h |  5 +++++
> > >  include/linux/security.h      | 11 +++++++++++
> > >  net/unix/af_unix.c            | 13 ++++++++++---
> > >  security/security.c           | 20 ++++++++++++++++++++
> > >  4 files changed, 46 insertions(+), 3 deletions(-)
> 
> ...
> 
> > Reviewed-by: Günther Noack <gnoack3000@gmail.com>
> >
> > Thank you, this looks good. I'll include it in the next version of the
> > Unix connect patch set again.
> 
> I'm looking for this patchset to review/ACK the new hook in context,
> but I'm not seeing it in my inbox or lore.  Did I simply miss the
> patchset or is it still a work in progress?  No worries if it hasn't
> been posted yet, I just wanted to make sure I wasn't holding this up
> any more than I already may have :)
>
Good Morning Paul,

Can't speak to the rest of the patch set, but I sent this LSM hook for
review purposes before inclusion with the rest of the V6 of the patch set.

Günther added his review tag, but I was asked to make some minor comment / commit
message updates. I sent the same patch, with updated comments/commit to him
in a follow up, off-list email to avoid spamming the list. No code changes were
made, just comments.

I don't think this particular patch will change substantially, unless we find
something unexpected. But the way we use the hook may change (especially with
respect to locking and the SOCK_DEAD state), which is important for your review.

So you may want to hold off your review until the full V6 series gets sent so
you can review the hook in context. There were some questions about
locking that needed proper digging into. [1]

Thank you for your time.

Justin

[1]: https://lore.kernel.org/linux-security-module/20260220.82a8adda6f95@gnoack.org/

> -- 
> paul-moore.com
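
Point (1) of the quoted commit message, a second path lookup opening a TOCTOU
window, has a familiar user-space analogue, sketched here. This is only an
analogy under stated assumptions, not the hook or any LSM code:
open_if_regular_racy() and open_if_regular() are hypothetical helpers. The
first resolves the path twice, the second resolves it once and then checks the
object it actually obtained.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Racy shape: the path is resolved twice (stat, then open). */
static int open_if_regular_racy(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
		return -1;
	/* The name may refer to a different object by now. */
	return open(path, O_RDONLY);
}

/* Single lookup: resolve once, then check the object that was opened. */
static int open_if_regular(const char *path)
{
	struct stat st;
	int fd = open(path, O_RDONLY | O_NONBLOCK);

	if (fd < 0)
		return -1;
	if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
		close(fd);
		return -1;
	}
	return fd;
}
```

The second shape mirrors the motivation given for the hook: the check is made
against the object found by the one lookup that is actually used, so a rename
between check and use cannot change the outcome.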

^ permalink raw reply

* Re: Subject: x86/msr + lockdown: allow access to **documented** RAPL/TCC controls under Secure Boot
From: bauen1 @ 2026-03-11 12:18 UTC (permalink / raw)
  To: Rafael J. Wysocki, Artem S. Tashkinov
  Cc: x86, Linux Kernel Mailing List, linux-pm, linux-efi,
	linux-security-module, Srinivas Pandruvada, Zhang, Rui
In-Reply-To: <CAJZ5v0hokpC_2E77nrm2KHeOdhhH6qvYsg3wCQpTEG=PCim=ww@mail.gmail.com>

On 3/9/26 4:13 PM, Rafael J. Wysocki wrote:
> On Mon, Mar 9, 2026 at 1:24 PM Artem S. Tashkinov <aros@gmx.com> wrote:
>>
>> Hello,
>>
>> When Secure Boot is enabled and kernel lockdown is active, the x86 MSR
>> driver blocks all raw MSR access from user space via `/dev/cpu/*/msr`.
>> This effectively prevents legitimate use of documented CPU power and
>> thermal management interfaces such as RAPL power limits (PL1/PL2) and
>> the TCC/TjOffset control. These registers are part of Intel’s
>> **publicly** documented architectural interface and have been stable
>> across many generations of processors.
> 
> There is a power capping RAPL driver.  What's the problem with it with
> Secure Boot enabled?

Hello,

I believe that the comment about Secure Boot might come from the partially
incorrect documentation of lockdown:

https://lore.kernel.org/linux-security-module/20260203195001.20131-1-hi@alyssa.is/

> -On an EFI-enabled x86 or arm64 machine, lockdown will be automatically enabled
> -if the system boots in EFI Secure Boot mode.

> This is true for Fedora, where this page was sourced from, but I don't
> believe it has ever been true for the mainline kernel, because Linus
> rejected it.



> 
>> As a result, under Secure Boot Linux users lose the ability to read or
>> adjust **standard** power-management controls that remain available
>> through equivalent tooling on other operating systems.
> 
> The power capping RAPL driver is there, please use it.  It is documented even.
> 
> There is also a driver for TCC/TjOffset control, it is called intel_tcc_cooling.
> 
> And there are utilities in user space (for example, Intel thermald)
> that use those interfaces.
> 
>> The current all-or-nothing restriction appears broader than necessary
>> for the stated goal of protecting kernel integrity. MSRs associated with
>> power limits and TCC offset are not privileged debugging or microcode
>> interfaces but standard hardware configuration knobs intended for
>> platform power and thermal management.
>>
>> It would be useful if the kernel either allowed access to a small
>> whitelist of such documented registers under lockdown or exposed a
>> mediated kernel interface for adjusting them. Without such a mechanism,
>> Secure Boot effectively disables legitimate and widely used
>> power/thermal tuning functionality on modern Intel laptops.
>>
>> Most (if not all) Intel laptops don't expose or allow to configure
>> PL1/PL2 limits in BIOS/EFI either.
> 
> Because it is not necessary to do so.
> 


-- 
bauen1
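
For reference, the powercap interface mentioned in the quoted reply is exposed
through sysfs, so reading a limit needs no MSR access. A minimal sketch,
assuming the intel-rapl zone 0 is present and that constraint 0 is the
long-term limit on the machine in question; read_sysfs_u64() and RAPL_PL1_PATH
are hypothetical names, and the zone layout varies by platform.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed path; the zone name and index depend on the platform. */
#define RAPL_PL1_PATH \
	"/sys/class/powercap/intel-rapl:0/constraint_0_power_limit_uw"

/*
 * Read one unsigned decimal value from a sysfs-style attribute file.
 * Returns 0 on success, -1 if the file is missing or not a number.
 */
static int read_sysfs_u64(const char *path, unsigned long long *val)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%llu", val) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

Calling read_sysfs_u64(RAPL_PL1_PATH, &uw) would yield the limit in
microwatts where the zone exists; a -1 return simply means the attribute is
not available on that system.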

^ permalink raw reply
