Linux Security Modules development

Linux Security Modules development
 help / color / mirror / Atom feed

* [PATCH v2 1/4] selftests/landlock: Fix snprintf truncation checks in audit helpers
From: Mickaël Salaün @ 2026-04-01 16:14 UTC (permalink / raw)
  To: Günther Noack
  Cc: Mickaël Salaün, linux-security-module, Justin Suess,
	Tingmao Wang
In-Reply-To: <20260401161503.1136946-1-mic@digikod.net>

snprintf() returns the number of characters that would have been
written, excluding the terminating NUL byte.  When the output is
truncated, this return value equals or exceeds the buffer size.  Fix
matches_log_domain_allocated() and matches_log_domain_deallocated() to
detect truncation with ">=" instead of ">".

Cc: Günther Noack <gnoack@google.com>
Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
Reviewed-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
---

Changes since v1:
https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
- New patch (split from the drain fix).
---
 tools/testing/selftests/landlock/audit.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
index 44eb433e9666..1049a0582af5 100644
--- a/tools/testing/selftests/landlock/audit.h
+++ b/tools/testing/selftests/landlock/audit.h
@@ -309,7 +309,7 @@ static int __maybe_unused matches_log_domain_allocated(int audit_fd, pid_t pid,
 
 	log_match_len =
 		snprintf(log_match, sizeof(log_match), log_template, pid);
-	if (log_match_len > sizeof(log_match))
+	if (log_match_len >= sizeof(log_match))
 		return -E2BIG;
 
 	return audit_match_record(audit_fd, AUDIT_LANDLOCK_DOMAIN, log_match,
@@ -326,7 +326,7 @@ static int __maybe_unused matches_log_domain_deallocated(
 
 	log_match_len = snprintf(log_match, sizeof(log_match), log_template,
 				 num_denials);
-	if (log_match_len > sizeof(log_match))
+	if (log_match_len >= sizeof(log_match))
 		return -E2BIG;
 
 	return audit_match_record(audit_fd, AUDIT_LANDLOCK_DOMAIN, log_match,
-- 
2.53.0


^ permalink raw reply related

* [PATCH v2 4/4] selftests/landlock: Skip stale records in audit_match_record()
From: Mickaël Salaün @ 2026-04-01 16:14 UTC (permalink / raw)
  To: Günther Noack
  Cc: Mickaël Salaün, linux-security-module, Justin Suess,
	Tingmao Wang
In-Reply-To: <20260401161503.1136946-1-mic@digikod.net>

Domain deallocation records are emitted asynchronously from kworker
threads (via free_ruleset_work()).  Stale deallocation records from a
previous test can arrive during the current test's deallocation read
loop and be picked up by audit_match_record() instead of the expected
record, causing a domain ID mismatch.  The audit.layers test (which
creates 16 nested domains) is particularly vulnerable because it reads
16 deallocation records in sequence, providing a large window for stale
records to interleave.

The same issue affects audit_flags.signal, where deallocation records
from a previous test (audit.layers) can leak into the next test and be
picked up by audit_match_record() instead of the expected record.

Fix this by continuing to read records when the type matches but the
content pattern does not.  Stale records are silently consumed, and the
loop only stops when both type and pattern match (or the socket times
out with -EAGAIN).

Additionally, extend matches_log_domain_deallocated() with an
expected_domain_id parameter.  When set, the regex pattern includes the
specific domain ID as a literal hex value, so that deallocation records
for a different domain do not match the pattern at all.  This handles
the case where the stale record has the same denial count as the
expected one (e.g. both have denials=1), which the type+pattern loop
alone cannot distinguish.  Callers that already know the expected domain
ID (from a prior denial or allocation record) now pass it to filter
precisely.

When expected_domain_id is set, matches_log_domain_deallocated() also
temporarily increases the socket timeout to audit_tv_dom_drop (1 second)
to wait for the asynchronous kworker deallocation, and restores
audit_tv_default afterward.  This removes the need for callers to manage
the timeout switch manually.

Cc: Günther Noack <gnoack@google.com>
Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
---

Changes since v1:
- New patch.
---
 tools/testing/selftests/landlock/audit.h      | 81 ++++++++++++++-----
 tools/testing/selftests/landlock/audit_test.c | 32 ++++----
 2 files changed, 75 insertions(+), 38 deletions(-)

diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
index 74e1c3d763be..b439a9b00f0e 100644
--- a/tools/testing/selftests/landlock/audit.h
+++ b/tools/testing/selftests/landlock/audit.h
@@ -249,9 +249,9 @@ static __maybe_unused char *regex_escape(const char *const src, char *dst,
 static int audit_match_record(int audit_fd, const __u16 type,
 			      const char *const pattern, __u64 *domain_id)
 {
-	struct audit_message msg;
+	struct audit_message msg, last_mismatch = {};
 	int ret, err = 0;
-	bool matches_record = !type;
+	int num_type_match = 0;
 	regmatch_t matches[2];
 	regex_t regex;
 
@@ -259,21 +259,35 @@ static int audit_match_record(int audit_fd, const __u16 type,
 	if (ret)
 		return -EINVAL;
 
-	do {
+	/*
+	 * Reads records until one matches both the expected type and the
+	 * pattern.  Type-matching records with non-matching content are
+	 * silently consumed, which handles stale domain deallocation records
+	 * from a previous test emitted asynchronously by kworker threads.
+	 */
+	while (true) {
 		memset(&msg, 0, sizeof(msg));
 		err = audit_recv(audit_fd, &msg);
-		if (err)
+		if (err) {
+			if (num_type_match) {
+				printf("DATA: %s\n", last_mismatch.data);
+				printf("ERROR: %d record(s) matched type %u"
+				       " but not pattern: %s\n",
+				       num_type_match, type, pattern);
+			}
 			goto out;
+		}
 
-		if (msg.header.nlmsg_type == type)
-			matches_record = true;
-	} while (!matches_record);
+		if (type && msg.header.nlmsg_type != type)
+			continue;
 
-	ret = regexec(&regex, msg.data, ARRAY_SIZE(matches), matches, 0);
-	if (ret) {
-		printf("DATA: %s\n", msg.data);
-		printf("ERROR: no match for pattern: %s\n", pattern);
-		err = -ENOENT;
+		ret = regexec(&regex, msg.data, ARRAY_SIZE(matches), matches,
+			      0);
+		if (!ret)
+			break;
+
+		num_type_match++;
+		last_mismatch = msg;
 	}
 
 	if (domain_id) {
@@ -316,21 +330,48 @@ static int __maybe_unused matches_log_domain_allocated(int audit_fd, pid_t pid,
 				  domain_id);
 }
 
-static int __maybe_unused matches_log_domain_deallocated(
-	int audit_fd, unsigned int num_denials, __u64 *domain_id)
+/*
+ * Matches a domain deallocation record.  When expected_domain_id is non-zero,
+ * the pattern includes the specific domain ID so that stale deallocation
+ * records from a previous test (with a different domain ID) are skipped by
+ * audit_match_record(), and the socket timeout is temporarily increased to
+ * audit_tv_dom_drop to wait for the asynchronous kworker deallocation.
+ */
+static int __maybe_unused
+matches_log_domain_deallocated(int audit_fd, unsigned int num_denials,
+			       __u64 expected_domain_id, __u64 *domain_id)
 {
 	static const char log_template[] = REGEX_LANDLOCK_PREFIX
 		" status=deallocated denials=%u$";
-	char log_match[sizeof(log_template) + 10];
-	int log_match_len;
+	static const char log_template_with_id[] =
+		"^audit([0-9.:]\\+): domain=\\(%llx\\)"
+		" status=deallocated denials=%u$";
+	char log_match[sizeof(log_template_with_id) + 32];
+	int log_match_len, err;
+
+	if (expected_domain_id)
+		log_match_len = snprintf(log_match, sizeof(log_match),
+					 log_template_with_id,
+					 expected_domain_id, num_denials);
+	else
+		log_match_len = snprintf(log_match, sizeof(log_match),
+					 log_template, num_denials);
 
-	log_match_len = snprintf(log_match, sizeof(log_match), log_template,
-				 num_denials);
 	if (log_match_len >= sizeof(log_match))
 		return -E2BIG;
 
-	return audit_match_record(audit_fd, AUDIT_LANDLOCK_DOMAIN, log_match,
-				  domain_id);
+	if (expected_domain_id)
+		setsockopt(audit_fd, SOL_SOCKET, SO_RCVTIMEO,
+			   &audit_tv_dom_drop, sizeof(audit_tv_dom_drop));
+
+	err = audit_match_record(audit_fd, AUDIT_LANDLOCK_DOMAIN, log_match,
+				 domain_id);
+
+	if (expected_domain_id)
+		setsockopt(audit_fd, SOL_SOCKET, SO_RCVTIMEO, &audit_tv_default,
+			   sizeof(audit_tv_default));
+
+	return err;
 }
 
 struct audit_records {
diff --git a/tools/testing/selftests/landlock/audit_test.c b/tools/testing/selftests/landlock/audit_test.c
index f92ba6774faa..a76c3686a0fe 100644
--- a/tools/testing/selftests/landlock/audit_test.c
+++ b/tools/testing/selftests/landlock/audit_test.c
@@ -139,13 +139,16 @@ TEST_F(audit, layers)
 	    WEXITSTATUS(status) != EXIT_SUCCESS)
 		_metadata->exit_code = KSFT_FAIL;
 
-	/* Purges log from deallocated domains. */
-	EXPECT_EQ(0, setsockopt(self->audit_fd, SOL_SOCKET, SO_RCVTIMEO,
-				&audit_tv_dom_drop, sizeof(audit_tv_dom_drop)));
+	/*
+	 * Purges log from deallocated domains.  Records arrive in LIFO order
+	 * (innermost domain first) because landlock_put_hierarchy() walks the
+	 * chain sequentially in a single kworker context.
+	 */
 	for (i = ARRAY_SIZE(*domain_stack) - 1; i >= 0; i--) {
 		__u64 deallocated_dom = 2;
 
 		EXPECT_EQ(0, matches_log_domain_deallocated(self->audit_fd, 1,
+							    (*domain_stack)[i],
 							    &deallocated_dom));
 		EXPECT_EQ((*domain_stack)[i], deallocated_dom)
 		{
@@ -154,8 +157,6 @@ TEST_F(audit, layers)
 		}
 	}
 	EXPECT_EQ(0, munmap(domain_stack, sizeof(*domain_stack)));
-	EXPECT_EQ(0, setsockopt(self->audit_fd, SOL_SOCKET, SO_RCVTIMEO,
-				&audit_tv_default, sizeof(audit_tv_default)));
 	EXPECT_EQ(0, close(ruleset_fd));
 }
 
@@ -270,13 +271,9 @@ TEST_F(audit, thread)
 	EXPECT_EQ(0, close(pipe_parent[1]));
 	ASSERT_EQ(0, pthread_join(thread, NULL));
 
-	EXPECT_EQ(0, setsockopt(self->audit_fd, SOL_SOCKET, SO_RCVTIMEO,
-				&audit_tv_dom_drop, sizeof(audit_tv_dom_drop)));
-	EXPECT_EQ(0, matches_log_domain_deallocated(self->audit_fd, 1,
-						    &deallocated_dom));
+	EXPECT_EQ(0, matches_log_domain_deallocated(
+			     self->audit_fd, 1, denial_dom, &deallocated_dom));
 	EXPECT_EQ(denial_dom, deallocated_dom);
-	EXPECT_EQ(0, setsockopt(self->audit_fd, SOL_SOCKET, SO_RCVTIMEO,
-				&audit_tv_default, sizeof(audit_tv_default)));
 }
 
 FIXTURE(audit_flags)
@@ -432,22 +429,21 @@ TEST_F(audit_flags, signal)
 
 	if (variant->restrict_flags &
 	    LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF) {
+		/*
+		 * No deallocation record: denials=0 never matches a real
+		 * record.
+		 */
 		EXPECT_EQ(-EAGAIN,
-			  matches_log_domain_deallocated(self->audit_fd, 0,
+			  matches_log_domain_deallocated(self->audit_fd, 0, 0,
 							 &deallocated_dom));
 		EXPECT_EQ(deallocated_dom, 2);
 	} else {
-		EXPECT_EQ(0, setsockopt(self->audit_fd, SOL_SOCKET, SO_RCVTIMEO,
-					&audit_tv_dom_drop,
-					sizeof(audit_tv_dom_drop)));
 		EXPECT_EQ(0, matches_log_domain_deallocated(self->audit_fd, 2,
+							    *self->domain_id,
 							    &deallocated_dom));
 		EXPECT_NE(deallocated_dom, 2);
 		EXPECT_NE(deallocated_dom, 0);
 		EXPECT_EQ(deallocated_dom, *self->domain_id);
-		EXPECT_EQ(0, setsockopt(self->audit_fd, SOL_SOCKET, SO_RCVTIMEO,
-					&audit_tv_default,
-					sizeof(audit_tv_default)));
 	}
 }
 
-- 
2.53.0


^ permalink raw reply related

* [PATCH v2 2/4] selftests/landlock: Fix socket file descriptor leaks in audit helpers
From: Mickaël Salaün @ 2026-04-01 16:14 UTC (permalink / raw)
  To: Günther Noack
  Cc: Mickaël Salaün, linux-security-module, Justin Suess,
	Tingmao Wang
In-Reply-To: <20260401161503.1136946-1-mic@digikod.net>

audit_init() opens a netlink socket and configures it, but leaks the
file descriptor if audit_set_status() or setsockopt() fails.  Fix this
by jumping to an error path that closes the socket before returning.

Apply the same fix to audit_init_with_exe_filter(), which leaks the file
descriptor from audit_init() if audit_init_filter_exe() or
audit_filter_exe() fails, and to audit_cleanup(), which leaks it if
audit_init_filter_exe() fails in FIXTURE_TEARDOWN_PARENT().

Cc: Günther Noack <gnoack@google.com>
Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
Link: https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
---

Changes since v1:
https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
- New patch (split from the drain fix, extended to
  audit_init_with_exe_filter() and audit_cleanup()).
---
 tools/testing/selftests/landlock/audit.h | 26 +++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
index 1049a0582af5..6422943fc69e 100644
--- a/tools/testing/selftests/landlock/audit.h
+++ b/tools/testing/selftests/landlock/audit.h
@@ -379,19 +379,25 @@ static int audit_init(void)
 
 	err = audit_set_status(fd, AUDIT_STATUS_ENABLED, 1);
 	if (err)
-		return err;
+		goto err_close;
 
 	err = audit_set_status(fd, AUDIT_STATUS_PID, getpid());
 	if (err)
-		return err;
+		goto err_close;
 
 	/* Sets a timeout for negative tests. */
 	err = setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &audit_tv_default,
 			 sizeof(audit_tv_default));
-	if (err)
-		return -errno;
+	if (err) {
+		err = -errno;
+		goto err_close;
+	}
 
 	return fd;
+
+err_close:
+	close(fd);
+	return err;
 }
 
 static int audit_init_filter_exe(struct audit_filter *filter, const char *path)
@@ -441,8 +447,10 @@ static int audit_cleanup(int audit_fd, struct audit_filter *filter)
 
 		filter = &new_filter;
 		err = audit_init_filter_exe(filter, NULL);
-		if (err)
+		if (err) {
+			close(audit_fd);
 			return err;
+		}
 	}
 
 	/* Filters might not be in place. */
@@ -468,11 +476,15 @@ static int audit_init_with_exe_filter(struct audit_filter *filter)
 
 	err = audit_init_filter_exe(filter, NULL);
 	if (err)
-		return err;
+		goto err_close;
 
 	err = audit_filter_exe(fd, filter, AUDIT_ADD_RULE);
 	if (err)
-		return err;
+		goto err_close;
 
 	return fd;
+
+err_close:
+	close(fd);
+	return err;
 }
-- 
2.53.0


^ permalink raw reply related

* [PATCH v2 3/4] selftests/landlock: Drain stale audit records on init
From: Mickaël Salaün @ 2026-04-01 16:14 UTC (permalink / raw)
  To: Günther Noack
  Cc: Mickaël Salaün, linux-security-module, Justin Suess,
	Tingmao Wang
In-Reply-To: <20260401161503.1136946-1-mic@digikod.net>

Non-audit Landlock tests generate audit records as side effects when
audit_enabled is non-zero (e.g. from boot configuration).  These records
accumulate in the kernel audit backlog while no audit daemon socket is
open.  When the next test opens a new netlink socket and registers as
the audit daemon, the stale backlog is delivered, causing baseline
record count checks to fail spuriously.

Fix this by draining all pending records in audit_init() right after
setting the receive timeout.  The 1-usec SO_RCVTIMEO causes audit_recv()
to return -EAGAIN once the backlog is empty, naturally terminating the
drain loop.

Domain deallocation records are emitted asynchronously from a work
queue, so they may still arrive after the drain.  Remove records.domain
== 0 checks that are not preceded by audit_match_record() calls, which
would otherwise consume stale records before the count.  Document this
constraint above audit_count_records().

Increasing the drain timeout to catch in-flight deallocation records was
considered but rejected: a longer timeout adds latency to every
audit_init() call even when no stale record is pending, and any fixed
timeout is still not guaranteed to catch all records under load.
Removing the unprotected checks is simpler and avoids the spurious
failures.

Cc: Günther Noack <gnoack@google.com>
Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
Link: https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
---

Changes since v1:
https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
- Also remove domain checks from audit.trace and
  scoped_audit.connect_to_child.
- Document records.domain == 0 constraint above
  audit_count_records().
- Explain why a longer drain timeout was rejected.
- Drop Reviewed-by (new code comment not in v1).
- Split snprintf and fd leak fixes into separate patches.
---
 tools/testing/selftests/landlock/audit.h      | 19 +++++++++++++++++++
 tools/testing/selftests/landlock/audit_test.c |  2 --
 .../testing/selftests/landlock/ptrace_test.c  |  1 -
 .../landlock/scoped_abstract_unix_test.c      |  1 -
 4 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
index 6422943fc69e..74e1c3d763be 100644
--- a/tools/testing/selftests/landlock/audit.h
+++ b/tools/testing/selftests/landlock/audit.h
@@ -338,6 +338,15 @@ struct audit_records {
 	size_t domain;
 };
 
+/*
+ * WARNING: Do not assert records.domain == 0 without a preceding
+ * audit_match_record() call.  Domain deallocation records are emitted
+ * asynchronously from kworker threads and can arrive after the drain in
+ * audit_init(), corrupting the domain count.  A preceding audit_match_record()
+ * call consumes stale records while scanning, making the assertion safe in
+ * practice because stale deallocation records arrive before the expected access
+ * records.
+ */
 static int audit_count_records(int audit_fd, struct audit_records *records)
 {
 	struct audit_message msg;
@@ -393,6 +402,16 @@ static int audit_init(void)
 		goto err_close;
 	}
 
+	/*
+	 * Drains stale audit records that accumulated in the kernel backlog
+	 * while no audit daemon socket was open.  This happens when non-audit
+	 * Landlock tests generate records while audit_enabled is non-zero (e.g.
+	 * from boot configuration), or when domain deallocation records arrive
+	 * asynchronously after a previous test's socket was closed.
+	 */
+	while (audit_recv(fd, NULL) == 0)
+		;
+
 	return fd;
 
 err_close:
diff --git a/tools/testing/selftests/landlock/audit_test.c b/tools/testing/selftests/landlock/audit_test.c
index 46d02d49835a..f92ba6774faa 100644
--- a/tools/testing/selftests/landlock/audit_test.c
+++ b/tools/testing/selftests/landlock/audit_test.c
@@ -412,7 +412,6 @@ TEST_F(audit_flags, signal)
 		} else {
 			EXPECT_EQ(1, records.access);
 		}
-		EXPECT_EQ(0, records.domain);
 
 		/* Updates filter rules to match the drop record. */
 		set_cap(_metadata, CAP_AUDIT_CONTROL);
@@ -601,7 +600,6 @@ TEST_F(audit_exec, signal_and_open)
 	/* Tests that there was no denial until now. */
 	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
 	EXPECT_EQ(0, records.access);
-	EXPECT_EQ(0, records.domain);
 
 	/*
 	 * Wait for the child to do a first denied action by layer1 and
diff --git a/tools/testing/selftests/landlock/ptrace_test.c b/tools/testing/selftests/landlock/ptrace_test.c
index 4f64c90583cd..1b6c8b53bf33 100644
--- a/tools/testing/selftests/landlock/ptrace_test.c
+++ b/tools/testing/selftests/landlock/ptrace_test.c
@@ -342,7 +342,6 @@ TEST_F(audit, trace)
 	/* Makes sure there is no superfluous logged records. */
 	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
 	EXPECT_EQ(0, records.access);
-	EXPECT_EQ(0, records.domain);
 
 	yama_ptrace_scope = get_yama_ptrace_scope();
 	ASSERT_LE(0, yama_ptrace_scope);
diff --git a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
index 72f97648d4a7..c47491d2d1c1 100644
--- a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
+++ b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
@@ -312,7 +312,6 @@ TEST_F(scoped_audit, connect_to_child)
 	/* Makes sure there is no superfluous logged records. */
 	EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
 	EXPECT_EQ(0, records.access);
-	EXPECT_EQ(0, records.domain);
 
 	ASSERT_EQ(0, pipe2(pipe_child, O_CLOEXEC));
 	ASSERT_EQ(0, pipe2(pipe_parent, O_CLOEXEC));
-- 
2.53.0


^ permalink raw reply related

* [PATCH v2 0/4] Fix Landlock audit test flakiness
From: Mickaël Salaün @ 2026-04-01 16:14 UTC (permalink / raw)
  To: Günther Noack
  Cc: Mickaël Salaün, linux-security-module, Justin Suess,
	Tingmao Wang

This series fixes two classes of audit selftest failures plus two minor
bugs in the audit test helpers.

The main issue is that domain deallocation audit records are emitted
asynchronously from kworker threads and can arrive after a previous
test's socket has been closed.  This causes two distinct failure modes:

- audit_match_record() picks up a stale deallocation record from a
  previous test instead of the expected one, causing a domain ID
  mismatch.  The audit.layers test (which reads 16 deallocation records
  in sequence) is particularly vulnerable because the large read window
  allows stale records to interleave.  Patch 4 fixes this by filtering
  deallocation records by domain ID and skipping type-matching records
  with wrong content patterns.

- audit_count_records() counts stale deallocation records from a
  previous test, incrementing records.domain from the expected 0 to 1.
  Patch 3 fixes this by draining stale records at audit_init() time and
  removing records.domain == 0 checks that are not preceded by
  audit_match_record() calls (which would consume stale records).

These races are more likely to manifest when additional instrumentation
changes kworker timing in the deallocation path (e.g. with the upcoming
Landlock tracepoints work).

The two minor fixes (patches 1-2) correct a snprintf truncation check
off-by-one and socket file descriptor leaks on error paths in
audit_init(), audit_init_with_exe_filter(), and audit_cleanup().

Patch 1 is an exact subset of the v1 combined patch, which is why it
carries the Reviewed-by tag.  Patches 2 and 3 extend beyond what was in
v1, so the Reviewed-by is not carried.  Patch 4 is new.

Changes since v1:
https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
- Split the combined drain fix into four separate patches.
- Patch 2: extend fd leak fix to audit_init_with_exe_filter() and
  audit_cleanup().
- Patch 3: also remove domain checks from audit.trace and
  scoped_audit.connect_to_child, document constraint, explain why a
  longer drain timeout was rejected.
- Patch 4: new, add domain ID filtering and timeout management to
  matches_log_domain_deallocated(), skip stale records in
  audit_match_record().

Mickaël Salaün (4):
  selftests/landlock: Fix snprintf truncation checks in audit helpers
  selftests/landlock: Fix socket file descriptor leaks in audit helpers
  selftests/landlock: Drain stale audit records on init
  selftests/landlock: Skip stale records in audit_match_record()

 tools/testing/selftests/landlock/audit.h      | 132 ++++++++++++++----
 tools/testing/selftests/landlock/audit_test.c |  34 ++---
 .../testing/selftests/landlock/ptrace_test.c  |   1 -
 .../landlock/scoped_abstract_unix_test.c      |   1 -
 4 files changed, 116 insertions(+), 52 deletions(-)

-- 
2.53.0


^ permalink raw reply

* [PATCH] landlock: Document fallocate(2) as another truncation corner case
From: Günther Noack @ 2026-04-01 15:09 UTC (permalink / raw)
  To: Mickaël Salaün; +Cc: linux-security-module, Günther Noack

Reinforce the already stated policy that LANDLOCK_ACCESS_FS_TRUNCATE should
always go hand in hand with LANDLOCK_ACCESS_FS_WRITE_FILE, as their
meanings and enforcement overlap in counterintuitive ways.

On many common file systems, fallocate(2) offers a way to shorten files as
long as the file is opened for writing, side-stepping the
LANDLOCK_ACCESS_FS_TRUNCATE right.

Assisted-by: Gemini-CLI:gemini-3.1
Signed-off-by: Günther Noack <gnoack@google.com>
---
 Documentation/userspace-api/landlock.rst | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
index 7f86d7a37dc2..d5691ec136cc 100644
--- a/Documentation/userspace-api/landlock.rst
+++ b/Documentation/userspace-api/landlock.rst
@@ -378,8 +378,8 @@ Truncating files
 
 The operations covered by ``LANDLOCK_ACCESS_FS_WRITE_FILE`` and
 ``LANDLOCK_ACCESS_FS_TRUNCATE`` both change the contents of a file and sometimes
-overlap in non-intuitive ways.  It is recommended to always specify both of
-these together.
+overlap in non-intuitive ways.  It is strongly recommended to always specify
+both of these together (either granting both, or granting none).
 
 A particularly surprising example is :manpage:`creat(2)`.  The name suggests
 that this system call requires the rights to create and write files.  However,
@@ -391,6 +391,10 @@ It should also be noted that truncating files does not require the
 system call, this can also be done through :manpage:`open(2)` with the flags
 ``O_RDONLY | O_TRUNC``.
 
+At the same time, on some filesystems, :manpage:`fallocate(2)` offers a way to
+shorten file contents with ``FALLOC_FL_COLLAPSE_RANGE`` when the file is opened
+for writing, sidestepping the ``LANDLOCK_ACCESS_FS_TRUNCATE`` right.
+
 The truncate right is associated with the opened file (see below).
 
 Rights associated with file descriptors
-- 
2.53.0.1185.g05d4b7b318-goog


^ permalink raw reply related

* Re: [PATCH v3 6/9] security: Hornet LSM
From: Paul Moore @ 2026-03-31 23:49 UTC (permalink / raw)
  To: Blaise Boscaccy, Blaise Boscaccy, Jonathan Corbet, James Morris,
	Serge E. Hallyn, Mickaël Salaün, Günther Noack,
	Dr. David Alan Gilbert, Andrew Morton, James.Bottomley, dhowells,
	Fan Wu, Ryan Foster, Randy Dunlap, linux-security-module,
	linux-doc, linux-kernel, bpf
In-Reply-To: <20260326060655.2550595-7-bboscaccy@linux.microsoft.com>

On Mar 26, 2026 Blaise Boscaccy <bboscaccy@linux.microsoft.com> wrote:
> 
> This adds the Hornet Linux Security Module which provides enhanced
> signature verification and data validation for eBPF programs. This
> allows users to continue to maintain an invariant that all code
> running inside of the kernel has actually been signed and verified, by
> the kernel.
> 
> This effort builds upon the currently excepted upstream solution. It
> further hardens it by providing deterministic, in-kernel checking of
> map hashes to solidify auditing along with preventing TOCTOU attacks
> against lskel map hashes.
> 
> Target map hashes are passed in via PKCS#7 signed attributes. Hornet
> determines the extent which the eBFP program is signed and defers to
> other LSMs for policy decisions.
> 
> Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
> Nacked-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> ---
>  Documentation/admin-guide/LSM/Hornet.rst | 321 ++++++++++++++++++++++
>  Documentation/admin-guide/LSM/index.rst  |   1 +
>  MAINTAINERS                              |   9 +
>  include/linux/oid_registry.h             |   3 +
>  include/uapi/linux/lsm.h                 |   1 +
>  security/Kconfig                         |   3 +-
>  security/Makefile                        |   1 +
>  security/hornet/Kconfig                  |  11 +
>  security/hornet/Makefile                 |   7 +
>  security/hornet/hornet.asn1              |  13 +
>  security/hornet/hornet_lsm.c             | 333 +++++++++++++++++++++++
>  11 files changed, 702 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/admin-guide/LSM/Hornet.rst
>  create mode 100644 security/hornet/Kconfig
>  create mode 100644 security/hornet/Makefile
>  create mode 100644 security/hornet/hornet.asn1
>  create mode 100644 security/hornet/hornet_lsm.c

...

> +static int hornet_check_program(struct bpf_prog *prog, union bpf_attr *attr,
> +				struct bpf_token *token, bool is_kernel,
> +				enum lsm_integrity_verdict *verdict)
> +{
> +	struct hornet_maps maps = {0};
> +	bpfptr_t usig = make_bpfptr(attr->signature, is_kernel);
> +	struct pkcs7_message *msg;
> +	struct hornet_parse_context *ctx;
> +	void *sig;
> +	int err;
> +	const void *authattrs;
> +	size_t authattrs_len;
> +
> +	if (!attr->signature) {
> +		*verdict = LSM_INT_VERDICT_UNSIGNED;
> +		return 0;
> +	}
> +
> +	ctx = kzalloc(sizeof(struct hornet_parse_context), GFP_KERNEL);
> +	if (!ctx)
> +		return -ENOMEM;
> +
> +	maps.fd_array = make_bpfptr(attr->fd_array, is_kernel);
> +	sig = kzalloc(attr->signature_size, GFP_KERNEL);
> +	if (!sig) {
> +		err = -ENOMEM;
> +		goto out;
> +	}
> +	err = copy_from_bpfptr(sig, usig, attr->signature_size);
> +	if (err != 0)
> +		goto cleanup_sig;
> +
> +	msg = pkcs7_parse_message(sig, attr->signature_size);
> +	if (IS_ERR(msg)) {
> +		err = LSM_INT_VERDICT_BADSIG;
> +		goto cleanup_sig;
> +	}
> +
> +	if (verify_pkcs7_message_sig(prog->insnsi, prog->len * sizeof(struct bpf_insn), msg,
> +				     VERIFY_USE_SECONDARY_KEYRING,
> +				     VERIFYING_BPF_SIGNATURE,
> +				     NULL, NULL)) {
> +		err = LSM_INT_VERDICT_UNKNOWNKEY;
> +		goto cleanup_msg;
> +	}

Given that kernel module signatures are verified with
VERIFY_USE_SECONDARY_KEYRING it's reasonable to do the same here in
Hornet.  I suspect most users concerned about code integrity, especially
code running in the kernel's context, will likely want to verify BPF
programs with the secondary keyring.

However, as we've seen from prior discussions, there is a desire among
some users to support arbitrary keyrings, and we should find a way to
support that in some configuration.

If we take a similar approach to bpf_verify_pkcs7_signature() and take
the keyring from attr->keyring_id, LSMs that provide enforcement via the
bpf_prog_load_post_integrity callback should be able to check the
keyring_id as part of their decision making and respond accordingly.  Do
we need to worry about a malicious userspace modifying attr at this
point?  I think the answer is "no", but I didn't chase it through the
code to be sure.

I suppose there might be a need for a yama-esque LSM which only provides
a bpf_prog_load_post_integrity callback and ensures a valid signature
verified against the VERIFY_USE_SECONDARY_KEYRING without the need for
any other policy or tunables, but let's see what the v4 revision looks
like first.  We can always add this later if needed, and it could live
within the Hornet dir (similar to how the integrity directory hosts
both the IMA and EVM LSMs).

> +	if (pkcs7_get_authattr(msg, OID_hornet_data,
> +			       &authattrs, &authattrs_len) == -ENODATA) {
> +		err = LSM_INT_VERDICT_PARTIALSIG;
> +		goto cleanup_msg;
> +	}
> +
> +	err = asn1_ber_decoder(&hornet_decoder, ctx, authattrs, authattrs_len);
> +	if (err < 0 || authattrs == NULL) {
> +		err = LSM_INT_VERDICT_BADSIG;
> +		goto cleanup_msg;
> +	}
> +
> +	err = hornet_verify_hashes(&maps, ctx, prog);
> +
> +cleanup_msg:
> +	pkcs7_free_message(msg);
> +cleanup_sig:
> +	kfree(sig);
> +out:
> +	kfree(ctx);
> +	return err;
> +}

--
paul-moore.com

^ permalink raw reply

* Re: [PATCH v3 4/9] lsm: framework for BPF integrity verification
From: Paul Moore @ 2026-03-31 22:04 UTC (permalink / raw)
  To: Song Liu
  Cc: Blaise Boscaccy, Jonathan Corbet, James Morris, Serge E. Hallyn,
	Mickaël Salaün, Günther Noack,
	Dr. David Alan Gilbert, Andrew Morton, James.Bottomley, dhowells,
	Fan Wu, Ryan Foster, Randy Dunlap, linux-security-module,
	linux-doc, linux-kernel, bpf
In-Reply-To: <CAPhsuW6fWG8674+LOAACqb1LDAz17w-CxvwBaTf7JXQL6ip5Pg@mail.gmail.com>

On Fri, Mar 27, 2026 at 2:25 PM Song Liu <song@kernel.org> wrote:
> On Fri, Mar 27, 2026 at 10:54 AM Blaise Boscaccy
> <bboscaccy@linux.microsoft.com> wrote:
> >
> > Song Liu <song@kernel.org> writes:
> >
> > > On Wed, Mar 25, 2026 at 11:07 PM Blaise Boscaccy
> > > <bboscaccy@linux.microsoft.com> wrote:
> > > [...]
> > >> The first new callback, bpf_prog_load_integrity(), located within the
> > >> security_bpf_prog_load() hook, is necessary to ensure that the integrity
> > >> verification callbacks are executed before any of the existing LSMs
> > >> are executed via the bpf_prog_load() callback.  Reusing the existing
> > >> bpf_prog_load() callback for integrity verification could result in LSMs
> > >> not having access to the integrity verification results when asked to
> > >> authorize the BPF program load in the bpf_prog_load() callback.
> > >>
> > >> The new LSM hook, security_bpf_prog_load_post_integrity(), is intended
> > >> to be called from within LSMs performing BPF program integrity
> > >> verification.  It is used to report the verdict of the integrity
> > >> verification to other LSMs enforcing access control policy on BPF
> > >> program loads.  LSMs enforcing such access controls should register a
> > >> bpf_prog_load_post_integrity() callback to receive integrity verdicts.
> > >
> > > bpf_prog_load_post_integrity() is weird. Some questions about it:
> > >
> > > 1. Is it possible to call it from other LSMs (not hornet)? Specifically, is it
> > >    possible to call it from BPF LSM?
> >
> > There is nothing hornet exclusive about that security hook. If the BPF
> > LSM folks wanted to use it they would probably need to implement a
> > kfunc to invoke it.
>
> Please also include the kfunc in v4.

Blaise is welcome to provide a kfunc for
bpf_prog_load_post_integrity(), but I don't see that as a requirement
for Hornet's acceptance.  If a developer wanted to write a service
LSM, like Hornet, in BPF to verify a BPF program's integrity, that
developer would be responsible for implementing the kfunc.  If you are
interested in doing that, I suggest you talk with KP as he is the BPF
LSM maintainer and I suspect he may have some concerns around
supporting that (see prior discussions around BPF signature
verification and his own implementation of BPF signature
verification).

As a reminder, if you want to apply security policy to a BPF program
load operation without considering the program's integrity or
provenance, you can do that today with the
security_bpf_prog_load()/bpf_prog_load hook/callback combination.

However, as you pointed out, we ultimately need to see at least one
LSM providing a callback for the bpf_prog_load_post_integrity
callback.  I wrote a toy SELinux implementation when I was playing
with Hornet a couple of revisions ago, which could serve as the basis
for a proper patch if needed, but my understanding is that Blaise has
been working with Fan on an IPE implementation.

-- 
paul-moore.com

^ permalink raw reply

* Re: [PATCH v3 1/2] lsm: add backing_file LSM hooks
From: Paul Moore @ 2026-03-31  2:13 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs, Gao Xiang, Christian Brauner
In-Reply-To: <CAOQ4uxgcOCP8cf8KvCsC=5OiuRvULKOf52mc2n9qEBAhPKoUGg@mail.gmail.com>

On Mon, Mar 30, 2026 at 4:35 AM Amir Goldstein <amir73il@gmail.com> wrote:
> On Fri, Mar 27, 2026 at 11:05 PM Paul Moore <paul@paul-moore.com> wrote:
> >
> > Stacked filesystems such as overlayfs do not currently provide the
> > necessary mechanisms for LSMs to properly enforce access controls on the
> > mmap() and mprotect() operations.  In order to resolve this gap, a LSM
> > security blob is being added to the backing_file struct and the following
> > new LSM hooks are being created ...

...

> > diff --git a/fs/file_table.c b/fs/file_table.c
> > index aaa5faaace1e..0bdc26cae138 100644
> > --- a/fs/file_table.c
> > +++ b/fs/file_table.c
> > @@ -50,6 +50,7 @@ struct backing_file {
> >                 struct path user_path;
> >                 freeptr_t bf_freeptr;
> >         };
> > +       void *security;
>
> This needs ifdef SECURITY

Yep.  That will require some other changes, but I should be able to
keep those fairly limited.

> and the name should be user_security

I'd strongly prefer to keep it as "security" both because "void
*security" is a very common pattern in the kernel when adding LSM
blobs to structs, and we don't know what each and every LSM may need
to store in the backing_file LSM blob.

> > +void *backing_file_security(const struct file *f)
> > +{
> > +       return backing_file(f)->security;
> > +}
>
> I prefer the name backing_file_user_security()
>
> Terminology here is very confusing but when saying
> "backing file" it is more natural that one is referring to the
> backing xfs file with overlayfs has opened.
>
> The "backing file" already has an LSM blob f->f_security
> which is fair the call it the "backing file's LSM blob" ...

From a LSM dev's perspective, I would call file->f_security blob a
file's LSM blob regardless of if the file is a user or backing file;
the backing_file->security blob would be a backing file LSM blob.  If
you look at what the SELinux code does, specifically in the
__file_has_perm() and __file_map_prot_check() functions (newly added
in this patchset), you'll see this supported by the code: the
file->f_security field is basically handled the same regardless of if
it is a user or backing file whereas the backing_file->security field
is handled very differently because it is a backing file.

While I think it makes sense to match the accessor function's name to
the struct field, ultimately I care more about the struct field's
name.  If you really feel strongly about changing
backing_file_security() to backing_file_user_security() I can live
with that so long as we keep backing_file->security intact.

> > @@ -73,8 +79,11 @@ static inline void file_free(struct file *f)
> >                 percpu_counter_dec(&nr_files);
> >         put_cred(f->f_cred);
> >         if (unlikely(f->f_mode & FMODE_BACKING)) {
> > -               path_put(backing_file_user_path(f));
> > -               kmem_cache_free(bfilp_cachep, backing_file(f));
> > +               struct backing_file *ff = backing_file(f);
> > +
> > +               security_backing_file_free(&ff->security);
> > +               path_put(&ff->user_path);
> > +               kmem_cache_free(bfilp_cachep, ff);
>
> Not directly related to your patch, but as this is growing, IMO
> this would look cleaner with backing_file_free() inline helper
> (see attached path).

Sure, I'll include your patch in the patchset.  I'll also fix the
comment style in the patch to match C style in the rest of the file
(I'll note the change in the metadata).

> >         } else {
> >                 kmem_cache_free(filp_cachep, f);
> >         }
> > @@ -290,7 +299,8 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
> >   * This is only for kernel internal use, and the allocate file must not be
> >   * installed into file tables or such.
> >   */
> > -struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
> > +struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
> > +                                     const struct file *user_file)
> >  {
> >         struct backing_file *ff;
> >         int error;
> > @@ -306,6 +316,11 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
> >         }
> >
> >         ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
> > +       error = security_backing_file_alloc(&ff->security, user_file);
> > +       if (unlikely(error)) {
> > +               fput(&ff->file);
> > +               return ERR_PTR(error);
> > +       }
> >         return &ff->file;
> >  }
>
> There is an API issue here.
> in order to call fput() we must ensure that user_security was initialized to
> NULL (or allocated).
>
> I don't think that we want security_backing_file_alloc() to provide this
> semantic and the current patch does not implement it.
>
> Furthermore, user_path is also not initialized in the error case.
>
> Attached UNTESTED fixup patch to suggest a cleanup with
> init_backing_file() helper.
>
> It also changes the variable and helper name to user_security
> and plays some trick to avoid many ifdef SECURITY.
> Feel free to take whichever bits you like with/without attribution.

I think I've got a cleaner approach than what you've proposed, let me
code it up, test it all, and I'll post a new revision of the patchset
soon.

-- 
paul-moore.com

^ permalink raw reply

* Re: [PATCH 6/7] tomoyo: Convert from sb_mount to granular mount hooks
From: Song Liu @ 2026-03-30 20:04 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Song Liu, Christian Brauner, viro@zeniv.linux.org.uk,
	paul@paul-moore.com, jmorris@namei.org, serge@hallyn.com,
	jack@suse.cz, john.johansen@canonical.com,
	stephen.smalley.work@gmail.com, omosnace@redhat.com,
	mic@digikod.net, gnoack@google.com, takedakn@nttdata.co.jp,
	herton@canonical.com, Kernel Team, selinux@vger.kernel.org,
	apparmor@lists.ubuntu.com, linux-fsdevel@vger.kernel.org,
	linux-security-module@vger.kernel.org
In-Reply-To: <4f5d1b1f-ecb2-421a-8a46-36c7a12d48de@I-love.SAKURA.ne.jp>

Hi Tetsuo,

On Tue, Mar 24, 2026 at 6:02 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
[...]
> >
> > If I understand Christian correctly, the main challenge here is that
> > FS_REQUIRES_DEV doesn't imply fc->source is the path of a device.
>
> Correct. FS_REQUIRES_DEV no longer implies that fc->source is a pathname.
>
> > Changing this assumption is a major change between VFS and many
> > filesystems.
>
> Wrong. I'm not trying to change this assumption. I'm trying to move LSM hook
> to a location after fc->source was interpreted by individual filesystem.

After spending some more time on this, I think we should not add
fc->source_path. This is because in many cases, dev_name for a new mount
is not a dev, or a path. For example, I can create a btrfs raid0 volume on two
devices, and use either of them as dev_name to mount the volume. It is not
accurate to put the path of either device in fc->source_path.

OTOH, if a LSM really wants to monitor the devices used by this mount,
security_sb_kern_mount() is a better hook to use. This will require the
LSM understands s_fs_info for specific file systems, which is not easy but
necessary.

For now, I think it makes sense to keep security_mount_new() in this set
as-is, and let LSMs like tomoyo and apparmor call kern_path when
necessary.

Thanks,
Song

^ permalink raw reply

* Re: [PATCH v8 01/12] lsm: Add LSM hook security_unix_find
From: Günther Noack @ 2026-03-30 19:02 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Paul Moore, John Johansen, Georgia Garcia, James Morris,
	Serge E . Hallyn, Tingmao Wang, Justin Suess,
	linux-security-module, Samasth Norway Ananda, Matthieu Buffet,
	Mikhail Ivanov, konstantin.meskhidze, Demi Marie Obenour,
	Alyssa Ross, Jann Horn, Tahera Fahimi, Sebastian Andrzej Siewior,
	Kuniyuki Iwashima, Simon Horman, netdev, Alexander Viro,
	Christian Brauner
In-Reply-To: <20260330.ie1eth0ex9Pa@digikod.net>

On Mon, Mar 30, 2026 at 06:02:05PM +0200, Mickaël Salaün wrote:
> On Fri, Mar 27, 2026 at 01:55:58PM -0400, Paul Moore wrote:
> > On Fri, Mar 27, 2026 at 12:49 PM Günther Noack <gnoack3000@gmail.com> wrote:
> > > Cc: Günther Noack <gnoack3000@gmail.com>
> > > Cc: Tingmao Wang <m@maowtm.org>
> > > Cc: Mickaël Salaün <mic@digikod.net>
> > > Cc: Paul Moore <paul@paul-moore.com>
> > > Signed-off-by: Justin Suess <utilityemal77@gmail.com>
> > > Signed-off-by: Günther Noack <gnoack3000@gmail.com>
> > > ---
> > >  include/linux/lsm_hook_defs.h |  5 +++++
> > >  include/linux/security.h      | 11 +++++++++++
> > >  net/unix/af_unix.c            | 10 +++++++---
> > >  security/security.c           | 20 ++++++++++++++++++++
> > >  4 files changed, 43 insertions(+), 3 deletions(-)
> > 
> > This patch doesn't look like it changed significantly in this
> > revision, is there a reason you dropped the tags from Georgia and I?
> 
> You'r right, the patch didn't change at all. I added Georgia's tag in my
> -next branch for the previous version, I guess Günther forgot to add it
> for this version, but I updated my branch with the same tag, so it's
> still there.  Thank you both BTW!

I only missed to add them by accident, thanks for adding it back,
Mickaël, and Paul for spotting it!


> I just included a one-line fix because of the m68k warning, we'll see if
> it works as expected, and we should be good to go.  It would be nice to
> have John's feedback though.

Agreed.  John, I would also appreciate if you could find the time to
have a look.

–Günther

^ permalink raw reply

* Re: [PATCH v8 03/12] landlock: Replace union access_masks_all with helper functions
From: Günther Noack @ 2026-03-30 19:00 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: John Johansen, kernel test robot, linux-security-module,
	Tingmao Wang, Justin Suess, Samasth Norway Ananda,
	Matthieu Buffet, Mikhail Ivanov, konstantin.meskhidze,
	Demi Marie Obenour, Alyssa Ross, Jann Horn, Tahera Fahimi,
	Sebastian Andrzej Siewior, Kuniyuki Iwashima, Georgia Garcia
In-Reply-To: <20260330.laew6AthooD6@digikod.net>

On Mon, Mar 30, 2026 at 12:53:21PM +0200, Mickaël Salaün wrote:
> On Mon, Mar 30, 2026 at 11:56:40AM +0200, Mickaël Salaün wrote:
> > On Fri, Mar 27, 2026 at 05:48:28PM +0100, Günther Noack wrote:
> > > * Stop using a union for access_masks_all.
> > > * Expose helper functions for intersection checks and union operations.
> > > 
> > > The memory layout of bitfields is only loosely defined by the C
> > > standard, so our static assertion that expects a fixed size was
> > > brittle, and it broke on some compilers when we attempted to add a
> > > 17th file system access right.
> > > 
> > > Reported-by: kernel test robot <lkp@intel.com>
> > > Closes: https://lore.kernel.org/oe-kbuild-all/202603261438.jBx2DGNe-lkp@intel.com/
> > > Signed-off-by: Günther Noack <gnoack3000@gmail.com>
> > > ---
> > >  security/landlock/access.h  | 21 ++++++++++++++-------
> > >  security/landlock/cred.h    | 10 ++--------
> > >  security/landlock/ruleset.h | 13 ++++---------
> > >  3 files changed, 20 insertions(+), 24 deletions(-)
> > 
> > I'd prefer this approach:
> > 
> > diff --git a/security/landlock/access.h b/security/landlock/access.h
> > index 89dc8e7b93da..bc9efbb5c900 100644
> > --- a/security/landlock/access.h
> > +++ b/security/landlock/access.h
> > @@ -50,7 +50,7 @@ struct access_masks {
> >         access_mask_t fs : LANDLOCK_NUM_ACCESS_FS;
> >         access_mask_t net : LANDLOCK_NUM_ACCESS_NET;
> >         access_mask_t scope : LANDLOCK_NUM_SCOPE;
> > -};
> > +} __packed;
> 
> Actually, we can just use '__packed __aligned(sizeof(u32))' and avoid
> the static_assert change.  That would have no impact on x86, but pack it
> on m68k.

Thanks, good catch (and thanks for pushing it to mic-next).
Fingers crossed that this works on m68k.

–Günther

^ permalink raw reply

* Re: [PATCH v8 01/12] lsm: Add LSM hook security_unix_find
From: Mickaël Salaün @ 2026-03-30 16:02 UTC (permalink / raw)
  To: Paul Moore, John Johansen, Georgia Garcia
  Cc: Günther Noack, James Morris, Serge E . Hallyn, Tingmao Wang,
	Justin Suess, linux-security-module, Samasth Norway Ananda,
	Matthieu Buffet, Mikhail Ivanov, konstantin.meskhidze,
	Demi Marie Obenour, Alyssa Ross, Jann Horn, Tahera Fahimi,
	Sebastian Andrzej Siewior, Kuniyuki Iwashima, Simon Horman,
	netdev, Alexander Viro, Christian Brauner
In-Reply-To: <CAHC9VhRktViRp_U3ZAvo2e7vbjUd6RZb-y8+YPXhtLpt7hsXYg@mail.gmail.com>

On Fri, Mar 27, 2026 at 01:55:58PM -0400, Paul Moore wrote:
> On Fri, Mar 27, 2026 at 12:49 PM Günther Noack <gnoack3000@gmail.com> wrote:
> >
> > From: Justin Suess <utilityemal77@gmail.com>
> >
> > Add an LSM hook security_unix_find.
> >
> > This hook is called to check the path of a named UNIX socket before a
> > connection is initiated. The peer socket may be inspected as well.
> >
> > Why existing hooks are unsuitable:
> >
> > Existing socket hooks, security_unix_stream_connect(),
> > security_unix_may_send(), and security_socket_connect() don't provide
> > TOCTOU-free / namespace independent access to the paths of sockets.
> >
> > (1) We cannot resolve the path from the struct sockaddr in existing hooks.
> > This requires another path lookup. A change in the path between the
> > two lookups will cause a TOCTOU bug.
> >
> > (2) We cannot use the struct path from the listening socket, because it
> > may be bound to a path in a different namespace than the caller,
> > resulting in a path that cannot be referenced at policy creation time.
> >
> > Consumers of the hook wishing to reference @other are responsible
> > for acquiring the unix_state_lock and checking for the SOCK_DEAD flag
> > therein, ensuring the socket hasn't died since lookup.
> >
> > Cc: Günther Noack <gnoack3000@gmail.com>
> > Cc: Tingmao Wang <m@maowtm.org>
> > Cc: Mickaël Salaün <mic@digikod.net>
> > Cc: Paul Moore <paul@paul-moore.com>
> > Signed-off-by: Justin Suess <utilityemal77@gmail.com>
> > Signed-off-by: Günther Noack <gnoack3000@gmail.com>
> > ---
> >  include/linux/lsm_hook_defs.h |  5 +++++
> >  include/linux/security.h      | 11 +++++++++++
> >  net/unix/af_unix.c            | 10 +++++++---
> >  security/security.c           | 20 ++++++++++++++++++++
> >  4 files changed, 43 insertions(+), 3 deletions(-)
> 
> This patch doesn't look like it changed significantly in this
> revision, is there a reason you dropped the tags from Georgia and I?

You'r right, the patch didn't change at all. I added Georgia's tag in my
-next branch for the previous version, I guess Günther forgot to add it
for this version, but I updated my branch with the same tag, so it's
still there.  Thank you both BTW!

I just included a one-line fix because of the m68k warning, we'll see if
it works as expected, and we should be good to go.  It would be nice to
have John's feedback though.

^ permalink raw reply

* Re: [PATCH v8 03/12] landlock: Replace union access_masks_all with helper functions
From: Mickaël Salaün @ 2026-03-30 10:53 UTC (permalink / raw)
  To: Günther Noack
  Cc: John Johansen, kernel test robot, linux-security-module,
	Tingmao Wang, Justin Suess, Samasth Norway Ananda,
	Matthieu Buffet, Mikhail Ivanov, konstantin.meskhidze,
	Demi Marie Obenour, Alyssa Ross, Jann Horn, Tahera Fahimi,
	Sebastian Andrzej Siewior, Kuniyuki Iwashima, Georgia Garcia
In-Reply-To: <20260330.ohsoe7vu2Nob@digikod.net>

On Mon, Mar 30, 2026 at 11:56:40AM +0200, Mickaël Salaün wrote:
> On Fri, Mar 27, 2026 at 05:48:28PM +0100, Günther Noack wrote:
> > * Stop using a union for access_masks_all.
> > * Expose helper functions for intersection checks and union operations.
> > 
> > The memory layout of bitfields is only loosely defined by the C
> > standard, so our static assertion that expects a fixed size was
> > brittle, and it broke on some compilers when we attempted to add a
> > 17th file system access right.
> > 
> > Reported-by: kernel test robot <lkp@intel.com>
> > Closes: https://lore.kernel.org/oe-kbuild-all/202603261438.jBx2DGNe-lkp@intel.com/
> > Signed-off-by: Günther Noack <gnoack3000@gmail.com>
> > ---
> >  security/landlock/access.h  | 21 ++++++++++++++-------
> >  security/landlock/cred.h    | 10 ++--------
> >  security/landlock/ruleset.h | 13 ++++---------
> >  3 files changed, 20 insertions(+), 24 deletions(-)
> 
> I'd prefer this approach:
> 
> diff --git a/security/landlock/access.h b/security/landlock/access.h
> index 89dc8e7b93da..bc9efbb5c900 100644
> --- a/security/landlock/access.h
> +++ b/security/landlock/access.h
> @@ -50,7 +50,7 @@ struct access_masks {
>         access_mask_t fs : LANDLOCK_NUM_ACCESS_FS;
>         access_mask_t net : LANDLOCK_NUM_ACCESS_NET;
>         access_mask_t scope : LANDLOCK_NUM_SCOPE;
> -};
> +} __packed;

Actually, we can just use '__packed __aligned(sizeof(u32))' and avoid
the static_assert change.  That would have no impact on x86, but pack it
on m68k.

> 
>  union access_masks_all {
>         struct access_masks masks;
> @@ -58,7 +58,7 @@ union access_masks_all {
>  };
> 
>  /* Makes sure all fields are covered. */
> -static_assert(sizeof(typeof_member(union access_masks_all, masks)) ==
> +static_assert(sizeof(typeof_member(union access_masks_all, masks)) <=
>               sizeof(typeof_member(union access_masks_all, all)));
> 
>  /**
> 
> 
> This keep the check and make sure new field will not introduce issues, but more
> importantly it save memory, which was one of the goal of struct access_masks.
> 
> If that's good with you I'll replace your patch and squash this packed
> annotation with the following patch.
> 
> 
> > 
> > diff --git a/security/landlock/access.h b/security/landlock/access.h
> > index 42c95747d7bd..277b6ed7f7bb 100644
> > --- a/security/landlock/access.h
> > +++ b/security/landlock/access.h
> > @@ -52,14 +52,21 @@ struct access_masks {
> >  	access_mask_t scope : LANDLOCK_NUM_SCOPE;
> >  };
> >  
> > -union access_masks_all {
> > -	struct access_masks masks;
> > -	u32 all;
> > -};
> > +/* Checks whether two access masks have any common bit set. */
> > +static inline bool access_masks_intersect(const struct access_masks a,
> > +					  const struct access_masks b)
> > +{
> > +	return (a.fs & b.fs) || (a.net & b.net) || (a.scope & b.scope);
> > +}
> >  
> > -/* Makes sure all fields are covered. */
> > -static_assert(sizeof(typeof_member(union access_masks_all, masks)) ==
> > -	      sizeof(typeof_member(union access_masks_all, all)));
> > +/* ORs the bits of @src into @dst. */
> > +static inline void access_masks_merge(struct access_masks *dst,
> > +				      const struct access_masks src)
> > +{
> > +	dst->fs |= src.fs;
> > +	dst->net |= src.net;
> > +	dst->scope |= src.scope;
> > +}
> >  
> >  /**
> >   * struct layer_access_masks - A boolean matrix of layers and access rights
> > diff --git a/security/landlock/cred.h b/security/landlock/cred.h
> > index f287c56b5fd4..207a6db1c086 100644
> > --- a/security/landlock/cred.h
> > +++ b/security/landlock/cred.h
> > @@ -123,9 +123,6 @@ landlock_get_applicable_subject(const struct cred *const cred,
> >  				const struct access_masks masks,
> >  				size_t *const handle_layer)
> >  {
> > -	const union access_masks_all masks_all = {
> > -		.masks = masks,
> > -	};
> >  	const struct landlock_ruleset *domain;
> >  	ssize_t layer_level;
> >  
> > @@ -138,11 +135,8 @@ landlock_get_applicable_subject(const struct cred *const cred,
> >  
> >  	for (layer_level = domain->num_layers - 1; layer_level >= 0;
> >  	     layer_level--) {
> > -		union access_masks_all layer = {
> > -			.masks = domain->access_masks[layer_level],
> > -		};
> > -
> > -		if (layer.all & masks_all.all) {
> > +		if (access_masks_intersect(domain->access_masks[layer_level],
> > +					   masks)) {
> >  			if (handle_layer)
> >  				*handle_layer = layer_level;
> >  
> > diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
> > index 889f4b30301a..9f8b33815c2c 100644
> > --- a/security/landlock/ruleset.h
> > +++ b/security/landlock/ruleset.h
> > @@ -229,18 +229,13 @@ static inline void landlock_get_ruleset(struct landlock_ruleset *const ruleset)
> >  static inline struct access_masks
> >  landlock_union_access_masks(const struct landlock_ruleset *const domain)
> >  {
> > -	union access_masks_all matches = {};
> > +	struct access_masks matches = {};
> >  	size_t layer_level;
> >  
> > -	for (layer_level = 0; layer_level < domain->num_layers; layer_level++) {
> > -		union access_masks_all layer = {
> > -			.masks = domain->access_masks[layer_level],
> > -		};
> > +	for (layer_level = 0; layer_level < domain->num_layers; layer_level++)
> > +		access_masks_merge(&matches, domain->access_masks[layer_level]);
> >  
> > -		matches.all |= layer.all;
> > -	}
> > -
> > -	return matches.masks;
> > +	return matches;
> >  }
> >  
> >  static inline void
> > -- 
> > 2.53.0
> > 
> > 

^ permalink raw reply

* Re: [PATCH v8 03/12] landlock: Replace union access_masks_all with helper functions
From: Mickaël Salaün @ 2026-03-30  9:56 UTC (permalink / raw)
  To: Günther Noack
  Cc: John Johansen, kernel test robot, linux-security-module,
	Tingmao Wang, Justin Suess, Samasth Norway Ananda,
	Matthieu Buffet, Mikhail Ivanov, konstantin.meskhidze,
	Demi Marie Obenour, Alyssa Ross, Jann Horn, Tahera Fahimi,
	Sebastian Andrzej Siewior, Kuniyuki Iwashima, Georgia Garcia
In-Reply-To: <20260327164838.38231-4-gnoack3000@gmail.com>

On Fri, Mar 27, 2026 at 05:48:28PM +0100, Günther Noack wrote:
> * Stop using a union for access_masks_all.
> * Expose helper functions for intersection checks and union operations.
> 
> The memory layout of bitfields is only loosely defined by the C
> standard, so our static assertion that expects a fixed size was
> brittle, and it broke on some compilers when we attempted to add a
> 17th file system access right.
> 
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202603261438.jBx2DGNe-lkp@intel.com/
> Signed-off-by: Günther Noack <gnoack3000@gmail.com>
> ---
>  security/landlock/access.h  | 21 ++++++++++++++-------
>  security/landlock/cred.h    | 10 ++--------
>  security/landlock/ruleset.h | 13 ++++---------
>  3 files changed, 20 insertions(+), 24 deletions(-)

I'd prefer this approach:

diff --git a/security/landlock/access.h b/security/landlock/access.h
index 89dc8e7b93da..bc9efbb5c900 100644
--- a/security/landlock/access.h
+++ b/security/landlock/access.h
@@ -50,7 +50,7 @@ struct access_masks {
        access_mask_t fs : LANDLOCK_NUM_ACCESS_FS;
        access_mask_t net : LANDLOCK_NUM_ACCESS_NET;
        access_mask_t scope : LANDLOCK_NUM_SCOPE;
-};
+} __packed;

 union access_masks_all {
        struct access_masks masks;
@@ -58,7 +58,7 @@ union access_masks_all {
 };

 /* Makes sure all fields are covered. */
-static_assert(sizeof(typeof_member(union access_masks_all, masks)) ==
+static_assert(sizeof(typeof_member(union access_masks_all, masks)) <=
              sizeof(typeof_member(union access_masks_all, all)));

 /**


This keep the check and make sure new field will not introduce issues, but more
importantly it save memory, which was one of the goal of struct access_masks.

If that's good with you I'll replace your patch and squash this packed
annotation with the following patch.


> 
> diff --git a/security/landlock/access.h b/security/landlock/access.h
> index 42c95747d7bd..277b6ed7f7bb 100644
> --- a/security/landlock/access.h
> +++ b/security/landlock/access.h
> @@ -52,14 +52,21 @@ struct access_masks {
>  	access_mask_t scope : LANDLOCK_NUM_SCOPE;
>  };
>  
> -union access_masks_all {
> -	struct access_masks masks;
> -	u32 all;
> -};
> +/* Checks whether two access masks have any common bit set. */
> +static inline bool access_masks_intersect(const struct access_masks a,
> +					  const struct access_masks b)
> +{
> +	return (a.fs & b.fs) || (a.net & b.net) || (a.scope & b.scope);
> +}
>  
> -/* Makes sure all fields are covered. */
> -static_assert(sizeof(typeof_member(union access_masks_all, masks)) ==
> -	      sizeof(typeof_member(union access_masks_all, all)));
> +/* ORs the bits of @src into @dst. */
> +static inline void access_masks_merge(struct access_masks *dst,
> +				      const struct access_masks src)
> +{
> +	dst->fs |= src.fs;
> +	dst->net |= src.net;
> +	dst->scope |= src.scope;
> +}
>  
>  /**
>   * struct layer_access_masks - A boolean matrix of layers and access rights
> diff --git a/security/landlock/cred.h b/security/landlock/cred.h
> index f287c56b5fd4..207a6db1c086 100644
> --- a/security/landlock/cred.h
> +++ b/security/landlock/cred.h
> @@ -123,9 +123,6 @@ landlock_get_applicable_subject(const struct cred *const cred,
>  				const struct access_masks masks,
>  				size_t *const handle_layer)
>  {
> -	const union access_masks_all masks_all = {
> -		.masks = masks,
> -	};
>  	const struct landlock_ruleset *domain;
>  	ssize_t layer_level;
>  
> @@ -138,11 +135,8 @@ landlock_get_applicable_subject(const struct cred *const cred,
>  
>  	for (layer_level = domain->num_layers - 1; layer_level >= 0;
>  	     layer_level--) {
> -		union access_masks_all layer = {
> -			.masks = domain->access_masks[layer_level],
> -		};
> -
> -		if (layer.all & masks_all.all) {
> +		if (access_masks_intersect(domain->access_masks[layer_level],
> +					   masks)) {
>  			if (handle_layer)
>  				*handle_layer = layer_level;
>  
> diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
> index 889f4b30301a..9f8b33815c2c 100644
> --- a/security/landlock/ruleset.h
> +++ b/security/landlock/ruleset.h
> @@ -229,18 +229,13 @@ static inline void landlock_get_ruleset(struct landlock_ruleset *const ruleset)
>  static inline struct access_masks
>  landlock_union_access_masks(const struct landlock_ruleset *const domain)
>  {
> -	union access_masks_all matches = {};
> +	struct access_masks matches = {};
>  	size_t layer_level;
>  
> -	for (layer_level = 0; layer_level < domain->num_layers; layer_level++) {
> -		union access_masks_all layer = {
> -			.masks = domain->access_masks[layer_level],
> -		};
> +	for (layer_level = 0; layer_level < domain->num_layers; layer_level++)
> +		access_masks_merge(&matches, domain->access_masks[layer_level]);
>  
> -		matches.all |= layer.all;
> -	}
> -
> -	return matches.masks;
> +	return matches;
>  }
>  
>  static inline void
> -- 
> 2.53.0
> 
> 

^ permalink raw reply related

* Re: [PATCH v3 1/2] lsm: add backing_file LSM hooks
From: Amir Goldstein @ 2026-03-30  8:35 UTC (permalink / raw)
  To: Paul Moore
  Cc: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs, Gao Xiang, Christian Brauner
In-Reply-To: <20260327220446.353103-5-paul@paul-moore.com>

[-- Attachment #1: Type: text/plain, Size: 6230 bytes --]

On Fri, Mar 27, 2026 at 11:05 PM Paul Moore <paul@paul-moore.com> wrote:
>
> Stacked filesystems such as overlayfs do not currently provide the
> necessary mechanisms for LSMs to properly enforce access controls on the
> mmap() and mprotect() operations.  In order to resolve this gap, a LSM
> security blob is being added to the backing_file struct and the following
> new LSM hooks are being created:
>
>  security_backing_file_alloc()
>  security_backing_file_free()
>  security_mmap_backing_file()
>
> The first two hooks are to manage the lifecycle of the LSM security blob
> in the backing_file struct, while the third provides a new mmap() access
> control point for the underlying backing file.  It is also expected that
> LSMs will likely want to update their security_file_mprotect() callback
> to address issues with their mprotect() controls, but that does not
> require a change to the security_file_mprotect() LSM hook.
>
> There are a two other small changes to support these new LSM hooks.  We
> pass the user file associated with a backing file down to
> alloc_empty_backing_file() so it can be included in the
> security_backing_file_alloc() hook, and we constify the file struct field
> in the LSM common_audit_data struct to better support LSMs that need to
> pass a const file struct pointer into the common LSM audit code.
>
> Thanks to Arnd Bergmann for identifying the missing EXPORT_SYMBOL_GPL()
> and supplying a fixup.
>
> Cc: stable@vger.kernel.org
> Acked-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Paul Moore <paul@paul-moore.com>
> ---
>  fs/backing-file.c             |  18 ++++--
>  fs/erofs/ishare.c             |  10 +++-
>  fs/file_table.c               |  21 ++++++-
>  fs/fuse/passthrough.c         |   2 +-
>  fs/internal.h                 |   3 +-
>  fs/overlayfs/dir.c            |   2 +-
>  fs/overlayfs/file.c           |   2 +-
>  include/linux/backing-file.h  |   4 +-
>  include/linux/fs.h            |   1 +
>  include/linux/lsm_audit.h     |   2 +-
>  include/linux/lsm_hook_defs.h |   5 ++
>  include/linux/lsm_hooks.h     |   1 +
>  include/linux/security.h      |  22 ++++++++
>  security/lsm.h                |   1 +
>  security/lsm_init.c           |   9 +++
>  security/security.c           | 100 ++++++++++++++++++++++++++++++++++
>  16 files changed, 187 insertions(+), 16 deletions(-)
>

That looks like a nice clean abstraction to me.

...

> diff --git a/fs/file_table.c b/fs/file_table.c
> index aaa5faaace1e..0bdc26cae138 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -50,6 +50,7 @@ struct backing_file {
>                 struct path user_path;
>                 freeptr_t bf_freeptr;
>         };
> +       void *security;

This needs ifdef SECURITY
and the name should be user_security

>  };
>
>  #define backing_file(f) container_of(f, struct backing_file, file)
> @@ -66,6 +67,11 @@ void backing_file_set_user_path(struct file *f, const struct path *path)
>  }
>  EXPORT_SYMBOL_GPL(backing_file_set_user_path);
>
> +void *backing_file_security(const struct file *f)
> +{
> +       return backing_file(f)->security;
> +}

I prefer the name backing_file_user_security()

Terminology here is very confusing but when saying
"backing file" it is more natural that one is referring to the
backing xfs file with overlayfs has opened.

The "backing file" already has an LSM blob f->f_security
which is fair the call it the "backing file's LSM blob"

Therefore, I think we need to make a distinction, as we did
with backing_file_user_path() and refer to this as something along
the lines of the "backing file's user LSM blob".

> +
>  static inline void file_free(struct file *f)
>  {
>         security_file_free(f);
> @@ -73,8 +79,11 @@ static inline void file_free(struct file *f)
>                 percpu_counter_dec(&nr_files);
>         put_cred(f->f_cred);
>         if (unlikely(f->f_mode & FMODE_BACKING)) {
> -               path_put(backing_file_user_path(f));
> -               kmem_cache_free(bfilp_cachep, backing_file(f));
> +               struct backing_file *ff = backing_file(f);
> +
> +               security_backing_file_free(&ff->security);
> +               path_put(&ff->user_path);
> +               kmem_cache_free(bfilp_cachep, ff);

Not directly related to your patch, but as this is growing, IMO
this would look cleaner with backing_file_free() inline helper
(see attached path).

>         } else {
>                 kmem_cache_free(filp_cachep, f);
>         }
> @@ -290,7 +299,8 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
>   * This is only for kernel internal use, and the allocate file must not be
>   * installed into file tables or such.
>   */
> -struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
> +struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
> +                                     const struct file *user_file)
>  {
>         struct backing_file *ff;
>         int error;
> @@ -306,6 +316,11 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
>         }
>
>         ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
> +       error = security_backing_file_alloc(&ff->security, user_file);
> +       if (unlikely(error)) {
> +               fput(&ff->file);
> +               return ERR_PTR(error);
> +       }
>         return &ff->file;
>  }

There is an API issue here.
in order to call fput() we must ensure that user_security was initialized to
NULL (or allocated).

I don't think that we want security_backing_file_alloc() to provide this
semantic and the current patch does not implement it.

Furthermore, user_path is also not initialized in the error case.

Attached UNTESTED fixup patch to suggest a cleanup with
init_backing_file() helper.

It also changes the variable and helper name to user_security
and plays some trick to avoid many ifdef SECURITY.
Feel free to take whichever bits you like with/without attribution.

If you prefer, attached also a proper prep patch.
compile tested only.

Thanks,
Amir.

[-- Attachment #2: 0001-backing_file_user_security.patch --]
[-- Type: text/x-patch, Size: 3765 bytes --]

From 4858f610d960454ab4de0f29f3557016e80848bd Mon Sep 17 00:00:00 2001
From: Amir Goldstein <amir73il@gmail.com>
Date: Mon, 30 Mar 2026 08:26:01 +0200
Subject: [PATCH] backing_file_user_security

---
 fs/file_table.c    | 47 ++++++++++++++++++++++++++++++++++++----------
 include/linux/fs.h |  2 +-
 2 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 0bdc26cae1389..4666e88ba687d 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -43,14 +43,19 @@ static struct kmem_cache *bfilp_cachep __ro_after_init;
 
 static struct percpu_counter nr_files __cacheline_aligned_in_smp;
 
-/* Container for backing file with optional user path */
+/* Container for backing file with optional user path and security blob */
 struct backing_file {
 	struct file file;
 	union {
 		struct path user_path;
 		freeptr_t bf_freeptr;
+		void *dummy_security;
 	};
-	void *security;
+#ifdef CONFIG_SECURITY
+	void *user_security;
+#else
+#define user_security dummy_security
+#endif
 };
 
 #define backing_file(f) container_of(f, struct backing_file, file)
@@ -67,9 +72,16 @@ void backing_file_set_user_path(struct file *f, const struct path *path)
 }
 EXPORT_SYMBOL_GPL(backing_file_set_user_path);
 
-void *backing_file_security(const struct file *f)
+void *backing_file_user_security(struct file *f)
 {
-	return backing_file(f)->security;
+	return backing_file(f)->user_security;
+}
+
+static inline void backing_file_free(struct backing_file *ff)
+{
+	security_backing_file_free(&ff->user_security);
+	path_put(&ff->user_path);
+	kmem_cache_free(bfilp_cachep, ff);
 }
 
 static inline void file_free(struct file *f)
@@ -79,11 +91,7 @@ static inline void file_free(struct file *f)
 		percpu_counter_dec(&nr_files);
 	put_cred(f->f_cred);
 	if (unlikely(f->f_mode & FMODE_BACKING)) {
-		struct backing_file *ff = backing_file(f);
-
-		security_backing_file_free(&ff->security);
-		path_put(&ff->user_path);
-		kmem_cache_free(bfilp_cachep, ff);
+		backing_file_free(backing_file(f));
 	} else {
 		kmem_cache_free(filp_cachep, f);
 	}
@@ -292,6 +300,23 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
 	return f;
 }
 
+static int init_backing_file(struct backing_file *ff,
+			     const struct file *user_file)
+{
+	int error;
+
+	memset(&ff->user_path, 0, sizeof(ff->user_path));
+	ff->user_security = NULL;
+
+	error = security_backing_file_alloc(&ff->user_security, user_file);
+	if (unlikely(error)) {
+		fput(&ff->file);
+		return ERR_PTR(error);
+	}
+
+	return 0;
+}
+
 /*
  * Variant of alloc_empty_file() that allocates a backing_file container
  * and doesn't check and modify nr_files.
@@ -315,12 +340,14 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
 		return ERR_PTR(error);
 	}
 
+	// The f_mode flags must be set before fput()
 	ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
-	error = security_backing_file_alloc(&ff->security, user_file);
+	error = init_backing_file(ff, user_file);
 	if (unlikely(error)) {
 		fput(&ff->file);
 		return ERR_PTR(error);
 	}
+
 	return &ff->file;
 }
 EXPORT_SYMBOL_GPL(alloc_empty_backing_file);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8f5702cfb5e0b..60450a0790add 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2474,7 +2474,7 @@ struct file *dentry_open_nonotify(const struct path *path, int flags,
 struct file *dentry_create(struct path *path, int flags, umode_t mode,
 			   const struct cred *cred);
 const struct path *backing_file_user_path(const struct file *f);
-void *backing_file_security(const struct file *f);
+void *backing_file_user_security(const struct file *f);
 
 /*
  * When mmapping a file on a stackable filesystem (e.g., overlayfs), the file
-- 
2.53.0


[-- Attachment #3: 0001-fs-prepare-for-adding-user_security-block-to-backing.patch --]
[-- Type: text/x-patch, Size: 2128 bytes --]

From cad1df280bcc935289c787f5f4deb4a23ea20fcd Mon Sep 17 00:00:00 2001
From: Amir Goldstein <amir73il@gmail.com>
Date: Mon, 30 Mar 2026 10:27:51 +0200
Subject: [PATCH] fs: prepare for adding user_security block to backing_file

In preparation to adding user_security blob to backing_file struct,
factor out helpers init_backing_file() and backing_file_free().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/file_table.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index aaa5faaace1e9..9e02c1f16db3c 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -66,6 +66,12 @@ void backing_file_set_user_path(struct file *f, const struct path *path)
 }
 EXPORT_SYMBOL_GPL(backing_file_set_user_path);
 
+static inline void backing_file_free(struct backing_file *ff)
+{
+	path_put(&ff->user_path);
+	kmem_cache_free(bfilp_cachep, ff);
+}
+
 static inline void file_free(struct file *f)
 {
 	security_file_free(f);
@@ -73,8 +79,7 @@ static inline void file_free(struct file *f)
 		percpu_counter_dec(&nr_files);
 	put_cred(f->f_cred);
 	if (unlikely(f->f_mode & FMODE_BACKING)) {
-		path_put(backing_file_user_path(f));
-		kmem_cache_free(bfilp_cachep, backing_file(f));
+		backing_file_free(backing_file(f));
 	} else {
 		kmem_cache_free(filp_cachep, f);
 	}
@@ -283,6 +288,12 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
 	return f;
 }
 
+static int init_backing_file(struct backing_file *ff)
+{
+	memset(&ff->user_path, 0, sizeof(ff->user_path));
+	return 0;
+}
+
 /*
  * Variant of alloc_empty_file() that allocates a backing_file container
  * and doesn't check and modify nr_files.
@@ -305,7 +316,14 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
 		return ERR_PTR(error);
 	}
 
+	// The f_mode flags must be set before fput()
 	ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
+	error = init_backing_file(ff);
+	if (unlikely(error)) {
+		fput(&ff->file);
+		return ERR_PTR(error);
+	}
+
 	return &ff->file;
 }
 EXPORT_SYMBOL_GPL(alloc_empty_backing_file);
-- 
2.53.0


^ permalink raw reply related

* Re: LSM namespacing API
From: Paul Moore @ 2026-03-30  0:56 UTC (permalink / raw)
  To: Dr. Greg
  Cc: Stephen Smalley, Ondrej Mosnacek, linux-security-module, selinux,
	John Johansen
In-Reply-To: <aclOtS61nbG5Wf3p@wind.enjellic.com>

On Sun, Mar 29, 2026 at 12:09 PM Dr. Greg <greg@enjellic.com> wrote:
> On Tue, Mar 24, 2026 at 05:31:09PM -0400, Paul Moore wrote:
> > On Tue, Mar 3, 2026 at 11:46???AM Paul Moore <paul@paul-moore.com> wrote:
> > >
> > > I'd really like to hear from some of the other LSMs before we start
> > > diving into the code.  It may sound funny, but from my perspective
> > > doing the work to get the API definition "right" is far more important
> > > than implementing it.
>
> > It's been three weeks now, and I haven't seen any strong arguments for
> > supporting the clone() API at this time, so we can leave that out for
> > now and stick with just the unshare() API for an initial attempt.  We
> > can always add a clone() API at a later date if needed; going small
> > and expanding over time is usually a better decision anyway.
> >
> > So to quickly summarize, here is where I think the discussion landed:
> >
> > * Implement the lsm_unshare() syscall
> >
> > I expect it would look something like 'lsm_unshare(struct lsm_ctx
> > *ctx, u32 size, u32 flags)' with @ctx specifying the particular LSM
> > being unshared, and @flags being 0/unused at this point in time
> > (unless we can think of something we want to specify here).  Like
> > lsm_set_self_attr(), only one @ctx can be specified at a time, so you
> > can only unshare one LSM at a time.
>
> Unless we miss something, it would seem that there needs to be
> additional thought as to how a process moves, atomically, from one
> effective security configuration to the next.
>
> At a minimum, if we restrict ourselves to the model of simply changing
> the namespace for a single LSM, there would seem to be a need to have
> a 2-step process in order to atomically transition from one security
> model/policy to the next.

That depends on the individual LSMs, they are free to interpret the
unshare request and handle it however they like.

> The interim between the first and second steps would allow an
> orchestrator to configure the new namespace and load new namespace
> specific policy into the security namespace ...

As discussed previously, the LSM policy load syscalls might include
some LSM namespace options. However, I first want to focus on
finalizing the most basic namespace API, which on Linux is arguably
the unshare() syscall concept.

> It would seem that the flags variable might be a good option to use to
> handle this 2-stage transition, for example LSM_NS_INIT and
> LSM_NS_CHANGE, respectively, to specify the initialization and
> execution phases of the transition.

No.  The lsm_unshare() syscall is intended to mimic the existing
unshare() syscall as a single step process from a user's perspective.
If it returns successfully the caller will be in a new LSM namespace
as defined by the individual LSM specified in the syscall.

> The other unanswered issue, or perhaps we missed it, are the security
> controls that should be associated with the unshare call.

Each LSM is free to implement whatever access controls it deems
necessary in its lsm_unshare() callback.

> Will there be a new LSM hook that allows other LSM's to veto the
> creation of a namespace either for itself or for another LSM?

I would expect the lsm_unshare() syscall to operate similarly to the
lsm_set_self_attr() syscall in this regard.

> Is there a need to have yet another kernel command-line parameter that
> would completely deny the ability to create security namespaces?

No, at least not at this point in time.

Individual LSMs can decide how they want to gate their own namespace
functionality, if they implement namespaces at all.

> Is CAP_MAC_ADMIN appropriate as the required capability to create a
> new namespace or does there need to be, for security rigor, a specific
> capability (CAP_LSM_NS?) that gates the ability to execute whatever
> form of the system call is adopted?

Once again, this is up to the individual LSMs, not the framework layer.

> Should there be an option to completely compile LSM namespaces out of
> the kernel?

That doesn't belong in the LSM framework layer, that is up to the
individual LSMs.

> > * Implement /proc/pid/ns/lsm and setns(CLONE_NEWLSM)
> >
> > As discussed previously, this allows us to move a process into an
> > existing, established LSM namespace set.  The caller cannot
> > selectively choose which individual LSM namespaces they join from the
> > given LSM namespace set, they receive the same LSM namespace
> > configuration as the target process.
>
> As an initial aside.  It would be assumed that a positive result of a
> setns call would be to cause the calling process to atomically change
> its security namespace set.  This would further suggest the need to
> have the security namespace creation process also execute atomically
> in a multi-LSM namespace change environment.

In the setns case no new LSM namespaces should be created, the process
simply joins an existing set of LSM namespaces.

> ... That is the concept of whether or not a setns
> call, for any resource namespace, should also force a security
> namespace change if the security namespace of the calling process
> differs from that of the target process.

That decision is left to the individual LSMs.

-- 
paul-moore.com

^ permalink raw reply

* Re: LSM namespacing API
From: Dr. Greg @ 2026-03-29 16:09 UTC (permalink / raw)
  To: Paul Moore
  Cc: Stephen Smalley, Ondrej Mosnacek, linux-security-module, selinux,
	John Johansen
In-Reply-To: <CAHC9VhRgi8_gdx0nKwkOws1VD6EFG+bHNTN5Q8YCxZ3HOCu5PQ@mail.gmail.com>

On Tue, Mar 24, 2026 at 05:31:09PM -0400, Paul Moore wrote:

Good afternoon, I hope the weekend has gone well for everyone.

A few comments on the LSM namespace architecture for when the current
overlayfs drama subsides... :-)

> On Tue, Mar 3, 2026 at 11:46???AM Paul Moore <paul@paul-moore.com> wrote:
> >
> > I'd really like to hear from some of the other LSMs before we start
> > diving into the code.  It may sound funny, but from my perspective
> > doing the work to get the API definition "right" is far more important
> > than implementing it.

> It's been three weeks now, and I haven't seen any strong arguments for
> supporting the clone() API at this time, so we can leave that out for
> now and stick with just the unshare() API for an initial attempt.  We
> can always add a clone() API at a later date if needed; going small
> and expanding over time is usually a better decision anyway.
> 
> So to quickly summarize, here is where I think the discussion landed:
> 
> * Implement the lsm_unshare() syscall
> 
> I expect it would look something like 'lsm_unshare(struct lsm_ctx
> *ctx, u32 size, u32 flags)' with @ctx specifying the particular LSM
> being unshared, and @flags being 0/unused at this point in time
> (unless we can think of something we want to specify here).  Like
> lsm_set_self_attr(), only one @ctx can be specified at a time, so you
> can only unshare one LSM at a time.

Unless we miss something, it would seem that there needs to be
additional thought as to how a process moves, atomically, from one
effective security configuration to the next.

At a minimum, if we restrict ourselves to the model of simply changing
the namespace for a single LSM, there would seem to be a need to have
a 2-step process in order to atomically transition from one security
model/policy to the next.

The logical first step would seem to be to signal an LSM that a
namespace change is impending, with the second step being to tell the
LSM to actually execute the transition.

Presumably in the first step, an LSM would allocate an LSM namespace
memory blob for the new security context and it would also seem like a
good place to determine whether or not the namespace change should be
allowed, secondary to an understanding of possible TOCTOU issues.

The interim between the first and second steps would allow an
orchestrator to configure the new namespace and load new namespace
specific policy into the security namespace blob allocated in the
first step.

It would seem that the flags variable might be a good option to use to
handle this 2-stage transition, for example LSM_NS_INIT and
LSM_NS_CHANGE, respectively, to specify the initialization and
execution phases of the transition.

A simple unshare call becomes much more problematic in the face of an
orchestrator that may wish to create a set of new LSM namespaces for a
new process/container environment.  The inability to atomically
activate the entire new representation of the LSM stack would seem to
be problematic.

The other unanswered issue, or perhaps we missed it, are the security
controls that should be associated with the unshare call.

For example:

Will there be a new LSM hook that allows other LSM's to veto the
creation of a namespace either for itself or for another LSM?  We've
mentioned this before, but it would seem logical that the ability to
deny a change in overall system security policy would be something
that the 'lockdown' LSM would want to do.

Is there a need to have yet another kernel command-line parameter that
would completely deny the ability to create security namespaces?

Is CAP_MAC_ADMIN appropriate as the required capability to create a
new namespace or does there need to be, for security rigor, a specific
capability (CAP_LSM_NS?) that gates the ability to execute whatever
form of the system call is adopted?

Should there be an option to completely compile LSM namespaces out of
the kernel?

> * Implement /proc/pid/ns/lsm and setns(CLONE_NEWLSM)
> 
> As discussed previously, this allows us to move a process into an
> existing, established LSM namespace set.  The caller cannot
> selectively choose which individual LSM namespaces they join from the
> given LSM namespace set, they receive the same LSM namespace
> configuration as the target process.

As an initial aside.  It would be assumed that a positive result of a
setns call would be to cause the calling process to atomically change
its security namespace set.  This would further suggest the need to
have the security namespace creation process also execute atomically
in a multi-LSM namespace change environment.

We may be the only group that has significant field experience with
this, but when it comes to LSM security namespaces, there is a larger
security issue at hand.  That is the concept of whether or not a setns
call, for any resource namespace, should also force a security
namespace change if the security namespace of the calling process
differs from that of the target process.

This, of course, runs up against the meme that containers are not a
kernel concept, but it seems safe to assume, for all practical
purposes, that this horse has bolted from the barn.

A gedanken experiment that should be near and dear to participants in
this conversation, Microsoft's Confidential Containers.

The current predicate for 'trust' based architectures is cryptographic
based integrity measurements and attestation.  If a resource
orchestrator has elected to place a container workload in an alternate
integrity namespace, should another process be allowed to enter, for
example the mount namespace of that process, without also entering
the integrity namespace for the process.

That is just the tip of the iceberg on this issue.

> Any comments, corrections, etc.?  If not, if someone wants to send me
> a patch{set} implementing these changes we can merge them into
> lsm/dev-staging until we have a LSM which implements support for the
> new API.

The above issues come from 10 years of experience in dealing with all
of the issues that arise, particularly in production environments,
with security namespaces.

Without solid answers to these issues the community would be remiss in
cementing down any API's, perhaps that is not a challenge with
existence only in staging.

We would be happy to test fire any API's, but if operational sentiment
is that only in-kernel LSM's and experience are relevant, the odds are
that this functionality isn't going to get done right.  The number of
individuals/people with first hand practical experience with these
issues can probably be comfortably enumerated with one hand.

> paul-moore.com

Have a good week.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project

^ permalink raw reply

* Re: [PATCH v3 1/2] lsm: add backing_file LSM hooks
From: Paul Moore @ 2026-03-28 16:34 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs, Gao Xiang, Christian Brauner, Miklos Szeredi
In-Reply-To: <CAOQ4uxjvCYRLcRM-JGgtCPXKRB1agCBacN1tmR7hDR9TLdVS4w@mail.gmail.com>

On Sat, Mar 28, 2026 at 4:29 AM Amir Goldstein <amir73il@gmail.com> wrote:
> On Fri, Mar 27, 2026 at 11:05 PM Paul Moore <paul@paul-moore.com> wrote:
> >
> > Stacked filesystems such as overlayfs do not currently provide the
> > necessary mechanisms for LSMs to properly enforce access controls on the
> > mmap() and mprotect() operations.  In order to resolve this gap, a LSM
> > security blob is being added to the backing_file struct and the following
> > new LSM hooks are being created:
> >
> >  security_backing_file_alloc()
> >  security_backing_file_free()
> >  security_mmap_backing_file()
> >
> > The first two hooks are to manage the lifecycle of the LSM security blob
> > in the backing_file struct, while the third provides a new mmap() access
> > control point for the underlying backing file.  It is also expected that
> > LSMs will likely want to update their security_file_mprotect() callback
> > to address issues with their mprotect() controls, but that does not
> > require a change to the security_file_mprotect() LSM hook.
> >
> > There are a two other small changes to support these new LSM hooks.  We
> > pass the user file associated with a backing file down to
> > alloc_empty_backing_file() so it can be included in the
> > security_backing_file_alloc() hook, and we constify the file struct field
> > in the LSM common_audit_data struct to better support LSMs that need to
> > pass a const file struct pointer into the common LSM audit code.
> >
> > Thanks to Arnd Bergmann for identifying the missing EXPORT_SYMBOL_GPL()
> > and supplying a fixup.
> >
> > Cc: stable@vger.kernel.org
> > Acked-by: Christian Brauner <brauner@kernel.org>
> > Signed-off-by: Paul Moore <paul@paul-moore.com>
> > ---
>
> I 100% agree with Christian.
> This is much better than my O_PATH file hack

I'm not surprised that both you and Christian prefer this solution, it
moves all the pain of resolving this issue to the individual LSMs.
Just look at how the SELinux code has changed, even trying to pretty
it up as best as possible, it's objectively much uglier now, not to
mention more complicated.

From my perspective the root cause of this issue lies in the
overlayfs/backing-file design, specifically how overlayfs hides
multiple files under a single file so that it can plug into the
existing VFS/userspace paradigm of a single file.  While the design
and abstraction are no doubt very clever things, and for the most part
they "just work", there are definitely some corner cases that require
special handling, e.g. LSM access controls around mprotect().  In my
opinion, the burden of hiding any ugliness associated with this
special handling lies with the subsystem implementing the abstraction,
which is why I was pushing for a solution where the VFS and/or
backing-file layer would provide the user file (or a stand-in) for the
LSMs to use.  Unfortunately, that was not something the overlayfs and
VFS communities were willing to tolerate, so those of us in the LSM
space were left with a terrible choice: accept that the overlayfs/VFS
folks don't care and hack around the shortcomings of overlayfs, or
leave a public vulnerability for an unknown period of time while the
overlayfs/VFS folks argue over a solution, with the a non-trivial
chance that the LSMs would need to hack around the problem anyway.

That's my view of were things were at, and why I begrudgingly took
this approach.  I'm sure you have your own perspecitve, and I'm not
going to be surprised if you view this primarily as a LSM problem;
it's a common viewpoint amongst Linux kernel maintainers not
responsible for a LSM.

> It is also what Miklos had initially suggested.

Perhaps I've lost the mail, but going back to when this issue was
first discovered, I don't see anything from Miklos relating to this in
either my inbox or the mailing lists.

> > diff --git a/fs/file_table.c b/fs/file_table.c
> > index aaa5faaace1e..0bdc26cae138 100644
> > --- a/fs/file_table.c
> > +++ b/fs/file_table.c
> > @@ -50,6 +50,7 @@ struct backing_file {
> >                 struct path user_path;
> >                 freeptr_t bf_freeptr;
> >         };
>
> Shouldn't we wrap this with
> #ifdef CONFIG_SECURITY

Sure, I'll change that.

> > +       void *security;
>
> please initialize it in init_file()

We lack a clean way to access the backing_file struct in init_file().
I placed the security_backing_file_alloc() initializer in
alloc_empty_backing_file() as it was the first place where we could
both allocate the necessary LSM blob and initialize it with the data
from the user file at the same time.

If you want to intialize backing_file->security in init_file(),
init_file() we will need to add a FMODE_BACKING check in init_file and
split the security_backing_file_alloc() hook into two: one in
init_file() to do the allocation and basic init, one in
alloc_empty_backing_file() to capture the necessary user file data.

Do you still prefer to move the backing_file->security initializtion
into init_file()?

> > +void *backing_file_security(const struct file *f)
> > +{
> > +       return backing_file(f)->security;
>
> I think LSM code should be completely responsible for this ptr
> assignment/free so you should export
>
> void **backing_file_security_ptr(const struct file *f)
> {
>        return &backing_file(f)->security;
> }

Doing so would require us to also move the backing_file struct
definition into include/linux/backing-file.h (or similar), which I
tried very hard to avoid as I suspected you would not approve of that.
I figured if you had wanted to expose the struct definition you would
have defined it in backing-file.h as opposed to file_table.c.

Would you like me to move the backing_file struct definition into
include/linux/backing-file.h?

> > @@ -73,8 +79,11 @@ static inline void file_free(struct file *f)
> >                 percpu_counter_dec(&nr_files);
> >         put_cred(f->f_cred);
> >         if (unlikely(f->f_mode & FMODE_BACKING)) {
> > -               path_put(backing_file_user_path(f));
> > -               kmem_cache_free(bfilp_cachep, backing_file(f));
> > +               struct backing_file *ff = backing_file(f);
> > +
> > +               security_backing_file_free(&ff->security);
>
> Why do you need to add this in vfs code?
>
> Can't you do the same in security_file_free(f)?
>         if (unlikely(f->f_mode & FMODE_BACKING))
>                 security_backing_file_free(backing_file_security_ptr(f));

See my comments above regarding the visibility of the backing_file struct.

> > +               path_put(&ff->user_path);
> > +               kmem_cache_free(bfilp_cachep, ff);
> >         } else {
> >                 kmem_cache_free(filp_cachep, f);
> >         }
> > @@ -290,7 +299,8 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
> >   * This is only for kernel internal use, and the allocate file must not be
> >   * installed into file tables or such.
> >   */
> > -struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
> > +struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
> > +                                     const struct file *user_file)
> >  {
> >         struct backing_file *ff;
> >         int error;
> > @@ -306,6 +316,11 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
> >         }
> >
> >         ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
> > +       error = security_backing_file_alloc(&ff->security, user_file);> +       if (unlikely(error)) {
> > +               fput(&ff->file);
> > +               return ERR_PTR(error);
> > +       }
> >         return &ff->file;
> >  }
> >  EXPORT_SYMBOL_GPL(alloc_empty_backing_file);
>
> Maybe, and I am not sure,
> alloc_empty_backing_file() should call ONLY
>             error = security_backing_file_alloc(&ff->file, user_file);
>
> Instead of security_file_alloc() AND security_backing_file_alloc()
> and security_backing_file_alloc() can make use of
> backing_file_security_ptr() accessor internally?

This is another case of the code being structured so that we don't
need to expose the backing_file struct definition to the LSMs.  If you
would prefer to expose the backing_file struct in include/linux I can
probably make a few additional simplifications to the code.

-- 
paul-moore.com

^ permalink raw reply

* Re: [PATCH v3 1/2] lsm: add backing_file LSM hooks
From: Amir Goldstein @ 2026-03-28  8:29 UTC (permalink / raw)
  To: Paul Moore
  Cc: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs, Gao Xiang, Christian Brauner, Miklos Szeredi
In-Reply-To: <20260327220446.353103-5-paul@paul-moore.com>

On Fri, Mar 27, 2026 at 11:05 PM Paul Moore <paul@paul-moore.com> wrote:
>
> Stacked filesystems such as overlayfs do not currently provide the
> necessary mechanisms for LSMs to properly enforce access controls on the
> mmap() and mprotect() operations.  In order to resolve this gap, a LSM
> security blob is being added to the backing_file struct and the following
> new LSM hooks are being created:
>
>  security_backing_file_alloc()
>  security_backing_file_free()
>  security_mmap_backing_file()
>
> The first two hooks are to manage the lifecycle of the LSM security blob
> in the backing_file struct, while the third provides a new mmap() access
> control point for the underlying backing file.  It is also expected that
> LSMs will likely want to update their security_file_mprotect() callback
> to address issues with their mprotect() controls, but that does not
> require a change to the security_file_mprotect() LSM hook.
>
> There are a two other small changes to support these new LSM hooks.  We
> pass the user file associated with a backing file down to
> alloc_empty_backing_file() so it can be included in the
> security_backing_file_alloc() hook, and we constify the file struct field
> in the LSM common_audit_data struct to better support LSMs that need to
> pass a const file struct pointer into the common LSM audit code.
>
> Thanks to Arnd Bergmann for identifying the missing EXPORT_SYMBOL_GPL()
> and supplying a fixup.
>
> Cc: stable@vger.kernel.org
> Acked-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Paul Moore <paul@paul-moore.com>
> ---

I 100% agree with Christian.
This is much better than my O_PATH file hack
It is also what Miklos had initially suggested.

I have a minor suggestion for API change though

>  fs/backing-file.c             |  18 ++++--
>  fs/erofs/ishare.c             |  10 +++-
>  fs/file_table.c               |  21 ++++++-
>  fs/fuse/passthrough.c         |   2 +-
>  fs/internal.h                 |   3 +-
>  fs/overlayfs/dir.c            |   2 +-
>  fs/overlayfs/file.c           |   2 +-
>  include/linux/backing-file.h  |   4 +-
>  include/linux/fs.h            |   1 +
>  include/linux/lsm_audit.h     |   2 +-
>  include/linux/lsm_hook_defs.h |   5 ++
>  include/linux/lsm_hooks.h     |   1 +
>  include/linux/security.h      |  22 ++++++++
>  security/lsm.h                |   1 +
>  security/lsm_init.c           |   9 +++
>  security/security.c           | 100 ++++++++++++++++++++++++++++++++++
>  16 files changed, 187 insertions(+), 16 deletions(-)
>
> diff --git a/fs/backing-file.c b/fs/backing-file.c
> index 45da8600d564..1f3bbfc75882 100644
> --- a/fs/backing-file.c
> +++ b/fs/backing-file.c
> @@ -12,6 +12,7 @@
>  #include <linux/backing-file.h>
>  #include <linux/splice.h>
>  #include <linux/mm.h>
> +#include <linux/security.h>
>
>  #include "internal.h"
>
> @@ -29,14 +30,15 @@
>   * returned file into a container structure that also stores the stacked
>   * file's path, which can be retrieved using backing_file_user_path().
>   */
> -struct file *backing_file_open(const struct path *user_path, int flags,
> +struct file *backing_file_open(const struct file *user_file, int flags,
>                                const struct path *real_path,
>                                const struct cred *cred)
>  {
> +       const struct path *user_path = &user_file->f_path;
>         struct file *f;
>         int error;
>
> -       f = alloc_empty_backing_file(flags, cred);
> +       f = alloc_empty_backing_file(flags, cred, user_file);
>         if (IS_ERR(f))
>                 return f;
>
> @@ -52,15 +54,16 @@ struct file *backing_file_open(const struct path *user_path, int flags,
>  }
>  EXPORT_SYMBOL_GPL(backing_file_open);
>
> -struct file *backing_tmpfile_open(const struct path *user_path, int flags,
> +struct file *backing_tmpfile_open(const struct file *user_file, int flags,
>                                   const struct path *real_parentpath,
>                                   umode_t mode, const struct cred *cred)
>  {
>         struct mnt_idmap *real_idmap = mnt_idmap(real_parentpath->mnt);
> +       const struct path *user_path = &user_file->f_path;
>         struct file *f;
>         int error;
>
> -       f = alloc_empty_backing_file(flags, cred);
> +       f = alloc_empty_backing_file(flags, cred, user_file);
>         if (IS_ERR(f))
>                 return f;
>
> @@ -336,8 +339,13 @@ int backing_file_mmap(struct file *file, struct vm_area_struct *vma,
>
>         vma_set_file(vma, file);
>
> -       scoped_with_creds(ctx->cred)
> +       scoped_with_creds(ctx->cred) {
> +               ret = security_mmap_backing_file(vma, file, user_file);
> +               if (ret)
> +                       return ret;
> +
>                 ret = vfs_mmap(vma->vm_file, vma);
> +       }
>
>         if (ctx->accessed)
>                 ctx->accessed(user_file);
> diff --git a/fs/erofs/ishare.c b/fs/erofs/ishare.c
> index 829d50d5c717..ec3fc5ac1a55 100644
> --- a/fs/erofs/ishare.c
> +++ b/fs/erofs/ishare.c
> @@ -4,6 +4,7 @@
>   */
>  #include <linux/xxhash.h>
>  #include <linux/mount.h>
> +#include <linux/security.h>
>  #include "internal.h"
>  #include "xattr.h"
>
> @@ -106,7 +107,8 @@ static int erofs_ishare_file_open(struct inode *inode, struct file *file)
>
>         if (file->f_flags & O_DIRECT)
>                 return -EINVAL;
> -       realfile = alloc_empty_backing_file(O_RDONLY|O_NOATIME, current_cred());
> +       realfile = alloc_empty_backing_file(O_RDONLY|O_NOATIME, current_cred(),
> +                                           file);
>         if (IS_ERR(realfile))
>                 return PTR_ERR(realfile);
>         ihold(sharedinode);
> @@ -150,8 +152,14 @@ static ssize_t erofs_ishare_file_read_iter(struct kiocb *iocb,
>  static int erofs_ishare_mmap(struct file *file, struct vm_area_struct *vma)
>  {
>         struct file *realfile = file->private_data;
> +       int err;
>
>         vma_set_file(vma, realfile);
> +
> +       err = security_mmap_backing_file(vma, realfile, file);
> +       if (err)
> +               return err;
> +
>         return generic_file_readonly_mmap(file, vma);
>  }
>
> diff --git a/fs/file_table.c b/fs/file_table.c
> index aaa5faaace1e..0bdc26cae138 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -50,6 +50,7 @@ struct backing_file {
>                 struct path user_path;
>                 freeptr_t bf_freeptr;
>         };

Shouldn't we wrap this with
#ifdef CONFIG_SECURITY

> +       void *security;

please initialize it in init_file()

>  };
>
>  #define backing_file(f) container_of(f, struct backing_file, file)
> @@ -66,6 +67,11 @@ void backing_file_set_user_path(struct file *f, const struct path *path)
>  }
>  EXPORT_SYMBOL_GPL(backing_file_set_user_path);
>
> +void *backing_file_security(const struct file *f)
> +{
> +       return backing_file(f)->security;

I think LSM code should be completely responsible for this ptr
assignment/free so you should export

void **backing_file_security_ptr(const struct file *f)
{
       return &backing_file(f)->security;
}

> +
>  static inline void file_free(struct file *f)
>  {
>         security_file_free(f);
> @@ -73,8 +79,11 @@ static inline void file_free(struct file *f)
>                 percpu_counter_dec(&nr_files);
>         put_cred(f->f_cred);
>         if (unlikely(f->f_mode & FMODE_BACKING)) {
> -               path_put(backing_file_user_path(f));
> -               kmem_cache_free(bfilp_cachep, backing_file(f));
> +               struct backing_file *ff = backing_file(f);
> +
> +               security_backing_file_free(&ff->security);

Why do you need to add this in vfs code?

Can't you do the same in security_file_free(f)?
        if (unlikely(f->f_mode & FMODE_BACKING))
                security_backing_file_free(backing_file_security_ptr(f));


> +               path_put(&ff->user_path);
> +               kmem_cache_free(bfilp_cachep, ff);
>         } else {
>                 kmem_cache_free(filp_cachep, f);
>         }
> @@ -290,7 +299,8 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
>   * This is only for kernel internal use, and the allocate file must not be
>   * installed into file tables or such.
>   */
> -struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
> +struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
> +                                     const struct file *user_file)
>  {
>         struct backing_file *ff;
>         int error;
> @@ -306,6 +316,11 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
>         }
>
>         ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
> +       error = security_backing_file_alloc(&ff->security, user_file);> +       if (unlikely(error)) {
> +               fput(&ff->file);
> +               return ERR_PTR(error);
> +       }
>         return &ff->file;
>  }
>  EXPORT_SYMBOL_GPL(alloc_empty_backing_file);

Maybe, and I am not sure,
alloc_empty_backing_file() should call ONLY
            error = security_backing_file_alloc(&ff->file, user_file);

Instead of security_file_alloc() AND security_backing_file_alloc()
and security_backing_file_alloc() can make use of
backing_file_security_ptr() accessor internally?

I think this will further abstract LSM implementation details from vfs
and avoid the need to spray #ifdef SECURITY in vfs code.

WDYT?

Thanks for following through with this elegant and clean API!
Amir.

^ permalink raw reply

* Re: [PATCH v3 6/9] security: Hornet LSM
From: kernel test robot @ 2026-03-28  2:55 UTC (permalink / raw)
  To: Blaise Boscaccy, Jonathan Corbet, Paul Moore, James Morris,
	Serge E. Hallyn, Mickaël Salaün, Günther Noack,
	Dr. David Alan Gilbert, Andrew Morton, James.Bottomley, dhowells,
	Fan Wu, Ryan Foster, Randy Dunlap, linux-security-module,
	linux-doc, linux-kernel, bpf
  Cc: oe-kbuild-all, Linux Memory Management List
In-Reply-To: <20260326060655.2550595-7-bboscaccy@linux.microsoft.com>

Hi Blaise,

kernel test robot noticed the following build errors:

[auto build test ERROR on herbert-cryptodev-2.6/master]
[also build test ERROR on herbert-crypto-2.6/master shuah-kselftest/next shuah-kselftest/fixes linus/master v7.0-rc5 next-20260327]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Blaise-Boscaccy/crypto-pkcs7-add-flag-for-validated-trust-on-a-signed-info-block/20260327-145024
base:   https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
patch link:    https://lore.kernel.org/r/20260326060655.2550595-7-bboscaccy%40linux.microsoft.com
patch subject: [PATCH v3 6/9] security: Hornet LSM
config: x86_64-randconfig-102-20260328 (https://download.01.org/0day-ci/archive/20260328/202603281030.AIoqyOy3-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260328/202603281030.AIoqyOy3-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202603281030.AIoqyOy3-lkp@intel.com/

All errors (new ones prefixed by >>):

>> security/hornet/hornet_lsm.c:194:6: error: call to undeclared function 'verify_pkcs7_message_sig'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     194 |         if (verify_pkcs7_message_sig(prog->insnsi, prog->len * sizeof(struct bpf_insn), msg,
         |             ^
   1 error generated.


vim +/verify_pkcs7_message_sig +194 security/hornet/hornet_lsm.c

   155	
   156	static int hornet_check_program(struct bpf_prog *prog, union bpf_attr *attr,
   157					struct bpf_token *token, bool is_kernel,
   158					enum lsm_integrity_verdict *verdict)
   159	{
   160		struct hornet_maps maps = {0};
   161		bpfptr_t usig = make_bpfptr(attr->signature, is_kernel);
   162		struct pkcs7_message *msg;
   163		struct hornet_parse_context *ctx;
   164		void *sig;
   165		int err;
   166		const void *authattrs;
   167		size_t authattrs_len;
   168	
   169		if (!attr->signature) {
   170			*verdict = LSM_INT_VERDICT_UNSIGNED;
   171			return 0;
   172		}
   173	
   174		ctx = kzalloc(sizeof(struct hornet_parse_context), GFP_KERNEL);
   175		if (!ctx)
   176			return -ENOMEM;
   177	
   178		maps.fd_array = make_bpfptr(attr->fd_array, is_kernel);
   179		sig = kzalloc(attr->signature_size, GFP_KERNEL);
   180		if (!sig) {
   181			err = -ENOMEM;
   182			goto out;
   183		}
   184		err = copy_from_bpfptr(sig, usig, attr->signature_size);
   185		if (err != 0)
   186			goto cleanup_sig;
   187	
   188		msg = pkcs7_parse_message(sig, attr->signature_size);
   189		if (IS_ERR(msg)) {
   190			err = LSM_INT_VERDICT_BADSIG;
   191			goto cleanup_sig;
   192		}
   193	
 > 194		if (verify_pkcs7_message_sig(prog->insnsi, prog->len * sizeof(struct bpf_insn), msg,
   195					     VERIFY_USE_SECONDARY_KEYRING,
   196					     VERIFYING_BPF_SIGNATURE,
   197					     NULL, NULL)) {
   198			err = LSM_INT_VERDICT_UNKNOWNKEY;
   199			goto cleanup_msg;
   200		}
   201	
   202		if (pkcs7_get_authattr(msg, OID_hornet_data,
   203				       &authattrs, &authattrs_len) == -ENODATA) {
   204			err = LSM_INT_VERDICT_PARTIALSIG;
   205			goto cleanup_msg;
   206		}
   207	
   208		err = asn1_ber_decoder(&hornet_decoder, ctx, authattrs, authattrs_len);
   209		if (err < 0 || authattrs == NULL) {
   210			err = LSM_INT_VERDICT_BADSIG;
   211			goto cleanup_msg;
   212		}
   213	
   214		err = hornet_verify_hashes(&maps, ctx, prog);
   215	
   216	cleanup_msg:
   217		pkcs7_free_message(msg);
   218	cleanup_sig:
   219		kfree(sig);
   220	out:
   221		kfree(ctx);
   222		return err;
   223	}
   224	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply

* [PATCH v3 2/2] selinux: fix overlayfs mmap() and mprotect() access checks
From: Paul Moore @ 2026-03-27 22:04 UTC (permalink / raw)
  To: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs
  Cc: Amir Goldstein, Gao Xiang, Christian Brauner
In-Reply-To: <20260327220446.353103-4-paul@paul-moore.com>

The existing SELinux security model for overlayfs is to allow access if
the current task is able to access the top level file (the "user" file)
and the mounter's credentials are sufficient to access the lower
level file (the "backing" file).  Unfortunately, the current code does
not properly enforce these access controls for both mmap() and mprotect()
operations on overlayfs filesystems.

This patch makes use of the newly created security_mmap_backing_file()
LSM hook to provide the missing backing file enforcement for mmap()
operations, and leverages the backing file API and new LSM blob to
provide the necessary information to properly enforce the mprotect()
access controls.

Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <paul@paul-moore.com>
---
 security/selinux/hooks.c          | 256 +++++++++++++++++++++---------
 security/selinux/include/objsec.h |  17 ++
 2 files changed, 202 insertions(+), 71 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index d8224ea113d1..d8557da79480 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -1745,6 +1745,60 @@ static inline int file_path_has_perm(const struct cred *cred,
 static int bpf_fd_pass(const struct file *file, u32 sid);
 #endif
 
+static int __file_has_perm(const struct cred *cred, const struct file *file,
+			   u32 av, bool bf_user_file)
+
+{
+	struct common_audit_data ad;
+	struct inode *inode;
+	u32 ssid = cred_sid(cred);
+	u32 tsid_fd;
+	int rc;
+
+	if (bf_user_file) {
+		struct backing_file_security_struct *bfsec;
+		const struct path *path;
+
+		if (WARN_ON(!(file->f_mode & FMODE_BACKING)))
+			return -EIO;
+
+		bfsec = selinux_backing_file(file);
+		path = backing_file_user_path(file);
+		tsid_fd = bfsec->uf_sid;
+		inode = d_inode(path->dentry);
+
+		ad.type = LSM_AUDIT_DATA_PATH;
+		ad.u.path = *path;
+	} else {
+		struct file_security_struct *fsec = selinux_file(file);
+
+		tsid_fd = fsec->sid;
+		inode = file_inode(file);
+
+		ad.type = LSM_AUDIT_DATA_FILE;
+		ad.u.file = file;
+	}
+
+	if (ssid != tsid_fd) {
+		rc = avc_has_perm(ssid, tsid_fd, SECCLASS_FD, FD__USE, &ad);
+		if (rc)
+			return rc;
+	}
+
+#ifdef CONFIG_BPF_SYSCALL
+	/* regardless of backing vs user file, use the underlying file here */
+	rc = bpf_fd_pass(file, ssid);
+	if (rc)
+		return rc;
+#endif
+
+	/* av is zero if only checking access to the descriptor. */
+	if (av)
+		return inode_has_perm(cred, inode, av, &ad);
+
+	return 0;
+}
+
 /* Check whether a task can use an open file descriptor to
    access an inode in a given way.  Check access to the
    descriptor itself, and then use dentry_has_perm to
@@ -1753,41 +1807,10 @@ static int bpf_fd_pass(const struct file *file, u32 sid);
    has the same SID as the process.  If av is zero, then
    access to the file is not checked, e.g. for cases
    where only the descriptor is affected like seek. */
-static int file_has_perm(const struct cred *cred,
-			 struct file *file,
-			 u32 av)
+static inline int file_has_perm(const struct cred *cred,
+				const struct file *file, u32 av)
 {
-	struct file_security_struct *fsec = selinux_file(file);
-	struct inode *inode = file_inode(file);
-	struct common_audit_data ad;
-	u32 sid = cred_sid(cred);
-	int rc;
-
-	ad.type = LSM_AUDIT_DATA_FILE;
-	ad.u.file = file;
-
-	if (sid != fsec->sid) {
-		rc = avc_has_perm(sid, fsec->sid,
-				  SECCLASS_FD,
-				  FD__USE,
-				  &ad);
-		if (rc)
-			goto out;
-	}
-
-#ifdef CONFIG_BPF_SYSCALL
-	rc = bpf_fd_pass(file, cred_sid(cred));
-	if (rc)
-		return rc;
-#endif
-
-	/* av is zero if only checking access to the descriptor. */
-	rc = 0;
-	if (av)
-		rc = inode_has_perm(cred, inode, av, &ad);
-
-out:
-	return rc;
+	return __file_has_perm(cred, file, av, false);
 }
 
 /*
@@ -3825,6 +3848,17 @@ static int selinux_file_alloc_security(struct file *file)
 	return 0;
 }
 
+static int selinux_backing_file_alloc(void *backing_file_blob,
+				      const struct file *user_file)
+{
+	struct backing_file_security_struct *bfsec;
+
+	bfsec = selinux_backing_file_raw(backing_file_blob);
+	bfsec->uf_sid = selinux_file(user_file)->sid;
+
+	return 0;
+}
+
 /*
  * Check whether a task has the ioctl permission and cmd
  * operation to an inode.
@@ -3942,42 +3976,55 @@ static int selinux_file_ioctl_compat(struct file *file, unsigned int cmd,
 
 static int default_noexec __ro_after_init;
 
-static int file_map_prot_check(struct file *file, unsigned long prot, int shared)
+static int __file_map_prot_check(const struct cred *cred,
+				 const struct file *file, unsigned long prot,
+				 bool shared, bool bf_user_file)
 {
-	const struct cred *cred = current_cred();
-	u32 sid = cred_sid(cred);
-	int rc = 0;
+	struct inode *inode = NULL;
+	bool prot_exec = prot & PROT_EXEC;
+	bool prot_write = prot & PROT_WRITE;
+
+	if (file) {
+		if (bf_user_file)
+			inode = d_inode(backing_file_user_path(file)->dentry);
+		else
+			inode = file_inode(file);
+	}
+
+	if (default_noexec && prot_exec &&
+	    (!file || IS_PRIVATE(inode) || (!shared && prot_write))) {
+		int rc;
+		u32 sid = cred_sid(cred);
 
-	if (default_noexec &&
-	    (prot & PROT_EXEC) && (!file || IS_PRIVATE(file_inode(file)) ||
-				   (!shared && (prot & PROT_WRITE)))) {
 		/*
-		 * We are making executable an anonymous mapping or a
-		 * private file mapping that will also be writable.
-		 * This has an additional check.
+		 * We are making executable an anonymous mapping or a private
+		 * file mapping that will also be writable.
 		 */
-		rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
-				  PROCESS__EXECMEM, NULL);
+		rc = avc_has_perm(sid, sid, SECCLASS_PROCESS, PROCESS__EXECMEM,
+				  NULL);
 		if (rc)
-			goto error;
+			return rc;
 	}
 
 	if (file) {
-		/* read access is always possible with a mapping */
+		/* "read" always possible, "write" only if shared */
 		u32 av = FILE__READ;
-
-		/* write access only matters if the mapping is shared */
-		if (shared && (prot & PROT_WRITE))
+		if (shared && prot_write)
 			av |= FILE__WRITE;
-
-		if (prot & PROT_EXEC)
+		if (prot_exec)
 			av |= FILE__EXECUTE;
 
-		return file_has_perm(cred, file, av);
+		return __file_has_perm(cred, file, av, bf_user_file);
 	}
 
-error:
-	return rc;
+	return 0;
+}
+
+static inline int file_map_prot_check(const struct cred *cred,
+				      const struct file *file,
+				      unsigned long prot, bool shared)
+{
+	return __file_map_prot_check(cred, file, prot, shared, false);
 }
 
 static int selinux_mmap_addr(unsigned long addr)
@@ -3993,36 +4040,80 @@ static int selinux_mmap_addr(unsigned long addr)
 	return rc;
 }
 
-static int selinux_mmap_file(struct file *file,
-			     unsigned long reqprot __always_unused,
-			     unsigned long prot, unsigned long flags)
+static int selinux_mmap_file_common(const struct cred *cred, struct file *file,
+				    unsigned long prot, bool shared)
 {
-	struct common_audit_data ad;
-	int rc;
-
 	if (file) {
+		int rc;
+		struct common_audit_data ad;
+
 		ad.type = LSM_AUDIT_DATA_FILE;
 		ad.u.file = file;
-		rc = inode_has_perm(current_cred(), file_inode(file),
-				    FILE__MAP, &ad);
+		rc = inode_has_perm(cred, file_inode(file), FILE__MAP, &ad);
 		if (rc)
 			return rc;
 	}
 
-	return file_map_prot_check(file, prot,
-				   (flags & MAP_TYPE) == MAP_SHARED);
+	return file_map_prot_check(cred, file, prot, shared);
+}
+
+static int selinux_mmap_file(struct file *file,
+			     unsigned long reqprot __always_unused,
+			     unsigned long prot, unsigned long flags)
+{
+	return selinux_mmap_file_common(current_cred(), file, prot,
+					(flags & MAP_TYPE) == MAP_SHARED);
+}
+
+/**
+ * selinux_mmap_backing_file - Check mmap permissions on a backing file
+ * @vma: memory region
+ * @backing_file: stacked filesystem backing file
+ * @user_file: user visible file
+ *
+ * This is called after selinux_mmap_file() on stacked filesystems, and it
+ * is this function's responsibility to verify access to @backing_file and
+ * setup the SELinux state for possible later use in the mprotect() code path.
+ *
+ * By the time this function is called, mmap() access to @user_file has already
+ * been authorized and @vma->vm_file has been set to point to @backing_file.
+ *
+ * Return zero on success, negative values otherwise.
+ */
+static int selinux_mmap_backing_file(struct vm_area_struct *vma,
+				     struct file *backing_file,
+				     struct file *user_file __always_unused)
+{
+	unsigned long prot = 0;
+
+	/* translate vma->vm_flags perms into PROT perms */
+	if (vma->vm_flags & VM_READ)
+		prot |= PROT_READ;
+	if (vma->vm_flags & VM_WRITE)
+		prot |= PROT_WRITE;
+	if (vma->vm_flags & VM_EXEC)
+		prot |= PROT_EXEC;
+
+	return selinux_mmap_file_common(backing_file->f_cred, backing_file,
+					prot, vma->vm_flags & VM_SHARED);
 }
 
 static int selinux_file_mprotect(struct vm_area_struct *vma,
 				 unsigned long reqprot __always_unused,
 				 unsigned long prot)
 {
+	int rc;
 	const struct cred *cred = current_cred();
 	u32 sid = cred_sid(cred);
+	const struct file *file = vma->vm_file;
+	bool backing_file;
+	bool shared = vma->vm_flags & VM_SHARED;
+
+	/* check if we need to trigger the "backing files are awful" mode */
+	backing_file = file && (file->f_mode & FMODE_BACKING);
 
 	if (default_noexec &&
 	    (prot & PROT_EXEC) && !(vma->vm_flags & VM_EXEC)) {
-		int rc = 0;
 		/*
 		 * We don't use the vma_is_initial_heap() helper as it has
 		 * a history of problems and is currently broken on systems
@@ -4036,11 +4127,15 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
 		    vma->vm_end <= vma->vm_mm->brk) {
 			rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
 					  PROCESS__EXECHEAP, NULL);
-		} else if (!vma->vm_file && (vma_is_initial_stack(vma) ||
+			if (rc)
+				return rc;
+		} else if (!file && (vma_is_initial_stack(vma) ||
 			    vma_is_stack_for_current(vma))) {
 			rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
 					  PROCESS__EXECSTACK, NULL);
-		} else if (vma->vm_file && vma->anon_vma) {
+			if (rc)
+				return rc;
+		} else if (file && vma->anon_vma) {
 			/*
 			 * We are making executable a file mapping that has
 			 * had some COW done. Since pages might have been
@@ -4048,13 +4143,29 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
 			 * modified content.  This typically should only
 			 * occur for text relocations.
 			 */
-			rc = file_has_perm(cred, vma->vm_file, FILE__EXECMOD);
+			rc = __file_has_perm(cred, file, FILE__EXECMOD,
+					     backing_file);
+			if (rc)
+				return rc;
+			if (backing_file) {
+				rc = file_has_perm(file->f_cred, file,
+						   FILE__EXECMOD);
+				if (rc)
+					return rc;
+			}
 		}
+	}
+
+	rc = __file_map_prot_check(cred, file, prot, shared, backing_file);
+	if (rc)
+		return rc;
+	if (backing_file) {
+		rc = file_map_prot_check(file->f_cred, file, prot, shared);
 		if (rc)
 			return rc;
 	}
 
-	return file_map_prot_check(vma->vm_file, prot, vma->vm_flags&VM_SHARED);
+	return 0;
 }
 
 static int selinux_file_lock(struct file *file, unsigned int cmd)
@@ -7393,6 +7504,7 @@ struct lsm_blob_sizes selinux_blob_sizes __ro_after_init = {
 	.lbs_cred = sizeof(struct cred_security_struct),
 	.lbs_task = sizeof(struct task_security_struct),
 	.lbs_file = sizeof(struct file_security_struct),
+	.lbs_backing_file = sizeof(struct backing_file_security_struct),
 	.lbs_inode = sizeof(struct inode_security_struct),
 	.lbs_ipc = sizeof(struct ipc_security_struct),
 	.lbs_key = sizeof(struct key_security_struct),
@@ -7498,9 +7610,11 @@ static struct security_hook_list selinux_hooks[] __ro_after_init = {
 
 	LSM_HOOK_INIT(file_permission, selinux_file_permission),
 	LSM_HOOK_INIT(file_alloc_security, selinux_file_alloc_security),
+	LSM_HOOK_INIT(backing_file_alloc, selinux_backing_file_alloc),
 	LSM_HOOK_INIT(file_ioctl, selinux_file_ioctl),
 	LSM_HOOK_INIT(file_ioctl_compat, selinux_file_ioctl_compat),
 	LSM_HOOK_INIT(mmap_file, selinux_mmap_file),
+	LSM_HOOK_INIT(mmap_backing_file, selinux_mmap_backing_file),
 	LSM_HOOK_INIT(mmap_addr, selinux_mmap_addr),
 	LSM_HOOK_INIT(file_mprotect, selinux_file_mprotect),
 	LSM_HOOK_INIT(file_lock, selinux_file_lock),
diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
index 5bddd28ea5cb..8ec493064aa2 100644
--- a/security/selinux/include/objsec.h
+++ b/security/selinux/include/objsec.h
@@ -88,6 +88,10 @@ struct file_security_struct {
 	u32 pseqno; /* Policy seqno at the time of file open */
 };
 
+struct backing_file_security_struct {
+	u32 uf_sid; /* associated user file fsec->sid */
+};
+
 struct superblock_security_struct {
 	u32 sid; /* SID of file system superblock */
 	u32 def_sid; /* default SID for labeling */
@@ -195,6 +199,19 @@ static inline struct file_security_struct *selinux_file(const struct file *file)
 	return file->f_security + selinux_blob_sizes.lbs_file;
 }
 
+static inline struct backing_file_security_struct *
+selinux_backing_file_raw(void *blob)
+{
+	return blob + selinux_blob_sizes.lbs_backing_file;
+}
+
+static inline struct backing_file_security_struct *
+selinux_backing_file(const struct file *backing_file)
+{
+	void *blob = backing_file_security(backing_file);
+	return selinux_backing_file_raw(blob);
+}
+
 static inline struct inode_security_struct *
 selinux_inode(const struct inode *inode)
 {
-- 
2.53.0


^ permalink raw reply related

* [PATCH v3 1/2] lsm: add backing_file LSM hooks
From: Paul Moore @ 2026-03-27 22:04 UTC (permalink / raw)
  To: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs
  Cc: Amir Goldstein, Gao Xiang, Christian Brauner
In-Reply-To: <20260327220446.353103-4-paul@paul-moore.com>

Stacked filesystems such as overlayfs do not currently provide the
necessary mechanisms for LSMs to properly enforce access controls on the
mmap() and mprotect() operations.  In order to resolve this gap, a LSM
security blob is being added to the backing_file struct and the following
new LSM hooks are being created:

 security_backing_file_alloc()
 security_backing_file_free()
 security_mmap_backing_file()

The first two hooks are to manage the lifecycle of the LSM security blob
in the backing_file struct, while the third provides a new mmap() access
control point for the underlying backing file.  It is also expected that
LSMs will likely want to update their security_file_mprotect() callback
to address issues with their mprotect() controls, but that does not
require a change to the security_file_mprotect() LSM hook.

There are a two other small changes to support these new LSM hooks.  We
pass the user file associated with a backing file down to
alloc_empty_backing_file() so it can be included in the
security_backing_file_alloc() hook, and we constify the file struct field
in the LSM common_audit_data struct to better support LSMs that need to
pass a const file struct pointer into the common LSM audit code.

Thanks to Arnd Bergmann for identifying the missing EXPORT_SYMBOL_GPL()
and supplying a fixup.

Cc: stable@vger.kernel.org
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
---
 fs/backing-file.c             |  18 ++++--
 fs/erofs/ishare.c             |  10 +++-
 fs/file_table.c               |  21 ++++++-
 fs/fuse/passthrough.c         |   2 +-
 fs/internal.h                 |   3 +-
 fs/overlayfs/dir.c            |   2 +-
 fs/overlayfs/file.c           |   2 +-
 include/linux/backing-file.h  |   4 +-
 include/linux/fs.h            |   1 +
 include/linux/lsm_audit.h     |   2 +-
 include/linux/lsm_hook_defs.h |   5 ++
 include/linux/lsm_hooks.h     |   1 +
 include/linux/security.h      |  22 ++++++++
 security/lsm.h                |   1 +
 security/lsm_init.c           |   9 +++
 security/security.c           | 100 ++++++++++++++++++++++++++++++++++
 16 files changed, 187 insertions(+), 16 deletions(-)

diff --git a/fs/backing-file.c b/fs/backing-file.c
index 45da8600d564..1f3bbfc75882 100644
--- a/fs/backing-file.c
+++ b/fs/backing-file.c
@@ -12,6 +12,7 @@
 #include <linux/backing-file.h>
 #include <linux/splice.h>
 #include <linux/mm.h>
+#include <linux/security.h>
 
 #include "internal.h"
 
@@ -29,14 +30,15 @@
  * returned file into a container structure that also stores the stacked
  * file's path, which can be retrieved using backing_file_user_path().
  */
-struct file *backing_file_open(const struct path *user_path, int flags,
+struct file *backing_file_open(const struct file *user_file, int flags,
 			       const struct path *real_path,
 			       const struct cred *cred)
 {
+	const struct path *user_path = &user_file->f_path;
 	struct file *f;
 	int error;
 
-	f = alloc_empty_backing_file(flags, cred);
+	f = alloc_empty_backing_file(flags, cred, user_file);
 	if (IS_ERR(f))
 		return f;
 
@@ -52,15 +54,16 @@ struct file *backing_file_open(const struct path *user_path, int flags,
 }
 EXPORT_SYMBOL_GPL(backing_file_open);
 
-struct file *backing_tmpfile_open(const struct path *user_path, int flags,
+struct file *backing_tmpfile_open(const struct file *user_file, int flags,
 				  const struct path *real_parentpath,
 				  umode_t mode, const struct cred *cred)
 {
 	struct mnt_idmap *real_idmap = mnt_idmap(real_parentpath->mnt);
+	const struct path *user_path = &user_file->f_path;
 	struct file *f;
 	int error;
 
-	f = alloc_empty_backing_file(flags, cred);
+	f = alloc_empty_backing_file(flags, cred, user_file);
 	if (IS_ERR(f))
 		return f;
 
@@ -336,8 +339,13 @@ int backing_file_mmap(struct file *file, struct vm_area_struct *vma,
 
 	vma_set_file(vma, file);
 
-	scoped_with_creds(ctx->cred)
+	scoped_with_creds(ctx->cred) {
+		ret = security_mmap_backing_file(vma, file, user_file);
+		if (ret)
+			return ret;
+
 		ret = vfs_mmap(vma->vm_file, vma);
+	}
 
 	if (ctx->accessed)
 		ctx->accessed(user_file);
diff --git a/fs/erofs/ishare.c b/fs/erofs/ishare.c
index 829d50d5c717..ec3fc5ac1a55 100644
--- a/fs/erofs/ishare.c
+++ b/fs/erofs/ishare.c
@@ -4,6 +4,7 @@
  */
 #include <linux/xxhash.h>
 #include <linux/mount.h>
+#include <linux/security.h>
 #include "internal.h"
 #include "xattr.h"
 
@@ -106,7 +107,8 @@ static int erofs_ishare_file_open(struct inode *inode, struct file *file)
 
 	if (file->f_flags & O_DIRECT)
 		return -EINVAL;
-	realfile = alloc_empty_backing_file(O_RDONLY|O_NOATIME, current_cred());
+	realfile = alloc_empty_backing_file(O_RDONLY|O_NOATIME, current_cred(),
+					    file);
 	if (IS_ERR(realfile))
 		return PTR_ERR(realfile);
 	ihold(sharedinode);
@@ -150,8 +152,14 @@ static ssize_t erofs_ishare_file_read_iter(struct kiocb *iocb,
 static int erofs_ishare_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct file *realfile = file->private_data;
+	int err;
 
 	vma_set_file(vma, realfile);
+
+	err = security_mmap_backing_file(vma, realfile, file);
+	if (err)
+		return err;
+
 	return generic_file_readonly_mmap(file, vma);
 }
 
diff --git a/fs/file_table.c b/fs/file_table.c
index aaa5faaace1e..0bdc26cae138 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -50,6 +50,7 @@ struct backing_file {
 		struct path user_path;
 		freeptr_t bf_freeptr;
 	};
+	void *security;
 };
 
 #define backing_file(f) container_of(f, struct backing_file, file)
@@ -66,6 +67,11 @@ void backing_file_set_user_path(struct file *f, const struct path *path)
 }
 EXPORT_SYMBOL_GPL(backing_file_set_user_path);
 
+void *backing_file_security(const struct file *f)
+{
+	return backing_file(f)->security;
+}
+
 static inline void file_free(struct file *f)
 {
 	security_file_free(f);
@@ -73,8 +79,11 @@ static inline void file_free(struct file *f)
 		percpu_counter_dec(&nr_files);
 	put_cred(f->f_cred);
 	if (unlikely(f->f_mode & FMODE_BACKING)) {
-		path_put(backing_file_user_path(f));
-		kmem_cache_free(bfilp_cachep, backing_file(f));
+		struct backing_file *ff = backing_file(f);
+
+		security_backing_file_free(&ff->security);
+		path_put(&ff->user_path);
+		kmem_cache_free(bfilp_cachep, ff);
 	} else {
 		kmem_cache_free(filp_cachep, f);
 	}
@@ -290,7 +299,8 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
  * This is only for kernel internal use, and the allocate file must not be
  * installed into file tables or such.
  */
-struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
+struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
+				      const struct file *user_file)
 {
 	struct backing_file *ff;
 	int error;
@@ -306,6 +316,11 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
 	}
 
 	ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
+	error = security_backing_file_alloc(&ff->security, user_file);
+	if (unlikely(error)) {
+		fput(&ff->file);
+		return ERR_PTR(error);
+	}
 	return &ff->file;
 }
 EXPORT_SYMBOL_GPL(alloc_empty_backing_file);
diff --git a/fs/fuse/passthrough.c b/fs/fuse/passthrough.c
index 72de97c03d0e..f2d08ac2459b 100644
--- a/fs/fuse/passthrough.c
+++ b/fs/fuse/passthrough.c
@@ -167,7 +167,7 @@ struct fuse_backing *fuse_passthrough_open(struct file *file, int backing_id)
 		goto out;
 
 	/* Allocate backing file per fuse file to store fuse path */
-	backing_file = backing_file_open(&file->f_path, file->f_flags,
+	backing_file = backing_file_open(file, file->f_flags,
 					 &fb->file->f_path, fb->cred);
 	err = PTR_ERR(backing_file);
 	if (IS_ERR(backing_file)) {
diff --git a/fs/internal.h b/fs/internal.h
index cbc384a1aa09..77e90e4124e0 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -106,7 +106,8 @@ extern void chroot_fs_refs(const struct path *, const struct path *);
  */
 struct file *alloc_empty_file(int flags, const struct cred *cred);
 struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred);
-struct file *alloc_empty_backing_file(int flags, const struct cred *cred);
+struct file *alloc_empty_backing_file(int flags, const struct cred *cred,
+				      const struct file *user_file);
 void backing_file_set_user_path(struct file *f, const struct path *path);
 
 static inline void file_put_write_access(struct file *file)
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index ff3dbd1ca61f..f2f20a611af3 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -1374,7 +1374,7 @@ static int ovl_create_tmpfile(struct file *file, struct dentry *dentry,
 				return PTR_ERR(cred);
 
 			ovl_path_upper(dentry->d_parent, &realparentpath);
-			realfile = backing_tmpfile_open(&file->f_path, flags, &realparentpath,
+			realfile = backing_tmpfile_open(file, flags, &realparentpath,
 							mode, current_cred());
 			err = PTR_ERR_OR_ZERO(realfile);
 			pr_debug("tmpfile/open(%pd2, 0%o) = %i\n", realparentpath.dentry, mode, err);
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 97bed2286030..27cc07738f33 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -48,7 +48,7 @@ static struct file *ovl_open_realfile(const struct file *file,
 			if (!inode_owner_or_capable(real_idmap, realinode))
 				flags &= ~O_NOATIME;
 
-			realfile = backing_file_open(file_user_path(file),
+			realfile = backing_file_open(file,
 						     flags, realpath, current_cred());
 		}
 	}
diff --git a/include/linux/backing-file.h b/include/linux/backing-file.h
index 1476a6ed1bfd..c939cd222730 100644
--- a/include/linux/backing-file.h
+++ b/include/linux/backing-file.h
@@ -18,10 +18,10 @@ struct backing_file_ctx {
 	void (*end_write)(struct kiocb *iocb, ssize_t);
 };
 
-struct file *backing_file_open(const struct path *user_path, int flags,
+struct file *backing_file_open(const struct file *user_file, int flags,
 			       const struct path *real_path,
 			       const struct cred *cred);
-struct file *backing_tmpfile_open(const struct path *user_path, int flags,
+struct file *backing_tmpfile_open(const struct file *user_file, int flags,
 				  const struct path *real_parentpath,
 				  umode_t mode, const struct cred *cred);
 ssize_t backing_file_read_iter(struct file *file, struct iov_iter *iter,
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8b3dd145b25e..8f5702cfb5e0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2474,6 +2474,7 @@ struct file *dentry_open_nonotify(const struct path *path, int flags,
 struct file *dentry_create(struct path *path, int flags, umode_t mode,
 			   const struct cred *cred);
 const struct path *backing_file_user_path(const struct file *f);
+void *backing_file_security(const struct file *f);
 
 /*
  * When mmapping a file on a stackable filesystem (e.g., overlayfs), the file
diff --git a/include/linux/lsm_audit.h b/include/linux/lsm_audit.h
index 382c56a97bba..584db296e43b 100644
--- a/include/linux/lsm_audit.h
+++ b/include/linux/lsm_audit.h
@@ -94,7 +94,7 @@ struct common_audit_data {
 #endif
 		char *kmod_name;
 		struct lsm_ioctlop_audit *op;
-		struct file *file;
+		const struct file *file;
 		struct lsm_ibpkey_audit *ibpkey;
 		struct lsm_ibendport_audit *ibendport;
 		int reason;
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 8c42b4bde09c..2c4da40757ad 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -191,6 +191,9 @@ LSM_HOOK(int, 0, file_permission, struct file *file, int mask)
 LSM_HOOK(int, 0, file_alloc_security, struct file *file)
 LSM_HOOK(void, LSM_RET_VOID, file_release, struct file *file)
 LSM_HOOK(void, LSM_RET_VOID, file_free_security, struct file *file)
+LSM_HOOK(int, 0, backing_file_alloc, void *backing_file_blobp,
+	 const struct file *user_file)
+LSM_HOOK(void, LSM_RET_VOID, backing_file_free, void *backing_file_blobp)
 LSM_HOOK(int, 0, file_ioctl, struct file *file, unsigned int cmd,
 	 unsigned long arg)
 LSM_HOOK(int, 0, file_ioctl_compat, struct file *file, unsigned int cmd,
@@ -198,6 +201,8 @@ LSM_HOOK(int, 0, file_ioctl_compat, struct file *file, unsigned int cmd,
 LSM_HOOK(int, 0, mmap_addr, unsigned long addr)
 LSM_HOOK(int, 0, mmap_file, struct file *file, unsigned long reqprot,
 	 unsigned long prot, unsigned long flags)
+LSM_HOOK(int, 0, mmap_backing_file, struct vm_area_struct *vma,
+	 struct file *backing_file, struct file *user_file)
 LSM_HOOK(int, 0, file_mprotect, struct vm_area_struct *vma,
 	 unsigned long reqprot, unsigned long prot)
 LSM_HOOK(int, 0, file_lock, struct file *file, unsigned int cmd)
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index d48bf0ad26f4..b4f8cad53ddb 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -104,6 +104,7 @@ struct security_hook_list {
 struct lsm_blob_sizes {
 	unsigned int lbs_cred;
 	unsigned int lbs_file;
+	unsigned int lbs_backing_file;
 	unsigned int lbs_ib;
 	unsigned int lbs_inode;
 	unsigned int lbs_sock;
diff --git a/include/linux/security.h b/include/linux/security.h
index 83a646d72f6f..0a726bb70479 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -471,11 +471,17 @@ int security_file_permission(struct file *file, int mask);
 int security_file_alloc(struct file *file);
 void security_file_release(struct file *file);
 void security_file_free(struct file *file);
+int security_backing_file_alloc(void **backing_file_blobp,
+				const struct file *user_file);
+void security_backing_file_free(void **backing_file_blobp);
 int security_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
 int security_file_ioctl_compat(struct file *file, unsigned int cmd,
 			       unsigned long arg);
 int security_mmap_file(struct file *file, unsigned long prot,
 			unsigned long flags);
+int security_mmap_backing_file(struct vm_area_struct *vma,
+			       struct file *backing_file,
+			       struct file *user_file);
 int security_mmap_addr(unsigned long addr);
 int security_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
 			   unsigned long prot);
@@ -1140,6 +1146,15 @@ static inline void security_file_release(struct file *file)
 static inline void security_file_free(struct file *file)
 { }
 
+static inline int security_backing_file_alloc(void **backing_file_blobp,
+					      const struct file *user_file)
+{
+	return 0;
+}
+
+static inline void security_backing_file_free(void **backing_file_blobp)
+{ }
+
 static inline int security_file_ioctl(struct file *file, unsigned int cmd,
 				      unsigned long arg)
 {
@@ -1159,6 +1174,13 @@ static inline int security_mmap_file(struct file *file, unsigned long prot,
 	return 0;
 }
 
+static inline int security_mmap_backing_file(struct vm_area_struct *vma,
+					     struct file *backing_file,
+					     struct file *user_file)
+{
+	return 0;
+}
+
 static inline int security_mmap_addr(unsigned long addr)
 {
 	return cap_mmap_addr(addr);
diff --git a/security/lsm.h b/security/lsm.h
index db77cc83e158..32f808ad4335 100644
--- a/security/lsm.h
+++ b/security/lsm.h
@@ -29,6 +29,7 @@ extern struct lsm_blob_sizes blob_sizes;
 
 /* LSM blob caches */
 extern struct kmem_cache *lsm_file_cache;
+extern struct kmem_cache *lsm_backing_file_cache;
 extern struct kmem_cache *lsm_inode_cache;
 
 /* LSM blob allocators */
diff --git a/security/lsm_init.c b/security/lsm_init.c
index 573e2a7250c4..7c0fd17f1601 100644
--- a/security/lsm_init.c
+++ b/security/lsm_init.c
@@ -293,6 +293,8 @@ static void __init lsm_prepare(struct lsm_info *lsm)
 	blobs = lsm->blobs;
 	lsm_blob_size_update(&blobs->lbs_cred, &blob_sizes.lbs_cred);
 	lsm_blob_size_update(&blobs->lbs_file, &blob_sizes.lbs_file);
+	lsm_blob_size_update(&blobs->lbs_backing_file,
+			     &blob_sizes.lbs_backing_file);
 	lsm_blob_size_update(&blobs->lbs_ib, &blob_sizes.lbs_ib);
 	/* inode blob gets an rcu_head in addition to LSM blobs. */
 	if (blobs->lbs_inode && blob_sizes.lbs_inode == 0)
@@ -441,6 +443,8 @@ int __init security_init(void)
 	if (lsm_debug) {
 		lsm_pr("blob(cred) size %d\n", blob_sizes.lbs_cred);
 		lsm_pr("blob(file) size %d\n", blob_sizes.lbs_file);
+		lsm_pr("blob(backing_file) size %d\n",
+		       blob_sizes.lbs_backing_file);
 		lsm_pr("blob(ib) size %d\n", blob_sizes.lbs_ib);
 		lsm_pr("blob(inode) size %d\n", blob_sizes.lbs_inode);
 		lsm_pr("blob(ipc) size %d\n", blob_sizes.lbs_ipc);
@@ -462,6 +466,11 @@ int __init security_init(void)
 		lsm_file_cache = kmem_cache_create("lsm_file_cache",
 						   blob_sizes.lbs_file, 0,
 						   SLAB_PANIC, NULL);
+	if (blob_sizes.lbs_backing_file)
+		lsm_backing_file_cache = kmem_cache_create(
+						   "lsm_backing_file_cache",
+						   blob_sizes.lbs_backing_file,
+						   0, SLAB_PANIC, NULL);
 	if (blob_sizes.lbs_inode)
 		lsm_inode_cache = kmem_cache_create("lsm_inode_cache",
 						    blob_sizes.lbs_inode, 0,
diff --git a/security/security.c b/security/security.c
index 67af9228c4e9..651a0d643c9f 100644
--- a/security/security.c
+++ b/security/security.c
@@ -81,6 +81,7 @@ const struct lsm_id *lsm_idlist[MAX_LSM_COUNT];
 struct lsm_blob_sizes blob_sizes;
 
 struct kmem_cache *lsm_file_cache;
+struct kmem_cache *lsm_backing_file_cache;
 struct kmem_cache *lsm_inode_cache;
 
 #define SECURITY_HOOK_ACTIVE_KEY(HOOK, IDX) security_hook_active_##HOOK##_##IDX
@@ -172,6 +173,28 @@ static int lsm_file_alloc(struct file *file)
 	return 0;
 }
 
+/**
+ * lsm_backing_file_alloc - allocate a composite backing file blob
+ * @backing_file_blobp: pointer to the backing file LSM blob pointer
+ *
+ * Allocate the backing file blob for all the modules.
+ *
+ * Returns 0, or -ENOMEM if memory can't be allocated.
+ */
+static int lsm_backing_file_alloc(void **backing_file_blobp)
+{
+	if (!lsm_backing_file_cache) {
+		*backing_file_blobp = NULL;
+		return 0;
+	}
+
+	*backing_file_blobp = kmem_cache_zalloc(lsm_backing_file_cache,
+						GFP_KERNEL);
+	if (*backing_file_blobp == NULL)
+		return -ENOMEM;
+	return 0;
+}
+
 /**
  * lsm_blob_alloc - allocate a composite blob
  * @dest: the destination for the blob
@@ -2417,6 +2440,57 @@ void security_file_free(struct file *file)
 	}
 }
 
+/**
+ * security_backing_file_alloc() - Allocate and setup a backing file blob
+ * @backing_file_blobp: pointer to the backing file LSM blob pointer
+ * @user_file: the associated user visible file
+ *
+ * Allocate a backing file LSM blob and perform any necessary initialization of
+ * the LSM blob.  There will be some operations where the LSM will not have
+ * access to @user_file after this point, so any important state associated
+ * with @user_file that is important to the LSM should be captured in the
+ * backing file's LSM blob.
+ *
+ * LSM's should avoid taking a reference to @user_file in this hook as it will
+ * result in problems later when the system attempts to drop/put the file
+ * references due to a circular dependency.
+ *
+ * Return: Return 0 if the hook is successful, negative values otherwise.
+ */
+int security_backing_file_alloc(void **backing_file_blobp,
+				const struct file *user_file)
+{
+	int rc;
+
+	rc = lsm_backing_file_alloc(backing_file_blobp);
+	if (rc)
+		return rc;
+	rc = call_int_hook(backing_file_alloc, *backing_file_blobp, user_file);
+	if (unlikely(rc))
+		security_backing_file_free(backing_file_blobp);
+
+	return rc;
+}
+
+/**
+ * security_backing_file_free() - Free a backing file blob
+ * @backing_file_blobp: pointer to the backing file LSM blob pointer
+ *
+ * Free any LSM state associate with a backing file's LSM blob, including the
+ * blob itself.
+ */
+void security_backing_file_free(void **backing_file_blobp)
+{
+	void *backing_file_blob = *backing_file_blobp;
+
+	call_void_hook(backing_file_free, backing_file_blob);
+
+	if (backing_file_blob) {
+		*backing_file_blobp = NULL;
+		kmem_cache_free(lsm_backing_file_cache, backing_file_blob);
+	}
+}
+
 /**
  * security_file_ioctl() - Check if an ioctl is allowed
  * @file: associated file
@@ -2505,6 +2579,32 @@ int security_mmap_file(struct file *file, unsigned long prot,
 			     flags);
 }
 
+/**
+ * security_mmap_backing_file - Check if mmap'ing a backing file is allowed
+ * @vma: the vm_area_struct for the mmap'd region
+ * @backing_file: the backing file being mmap'd
+ * @user_file: the user file being mmap'd
+ *
+ * Check permissions for a mmap operation on a stacked filesystem.  This hook
+ * is called after the security_mmap_file() and is responsible for authorizing
+ * the mmap on @backing_file.  It is important to note that the mmap operation
+ * on @user_file has already been authorized and the @vma->vm_file has been
+ * set to @backing_file.
+ *
+ * Return: Returns 0 if permission is granted.
+ */
+int security_mmap_backing_file(struct vm_area_struct *vma,
+			       struct file *backing_file,
+			       struct file *user_file)
+{
+	/* recommended by the stackable filesystem devs */
+	if (WARN_ON_ONCE(!(backing_file->f_mode & FMODE_BACKING)))
+		return -EIO;
+
+	return call_int_hook(mmap_backing_file, vma, backing_file, user_file);
+}
+EXPORT_SYMBOL_GPL(security_mmap_backing_file);
+
 /**
  * security_mmap_addr() - Check if mmap'ing an address is allowed
  * @addr: address
-- 
2.53.0


^ permalink raw reply related

* [PATCH v3 0/2] Fix incorrect overlayfs mmap() and mprotect() LSM access controls
From: Paul Moore @ 2026-03-27 22:04 UTC (permalink / raw)
  To: linux-security-module, selinux, linux-fsdevel, linux-unionfs,
	linux-erofs
  Cc: Amir Goldstein, Gao Xiang, Christian Brauner

A very minor update to the v2 patchset[2] posted earlier this week.  The
changelog is below.  The primary reason for posting such a lightly revised
patchset is to drop the "RFC" qualifier as I've had the opportunity to do
additional testing and I'm reasonably happy with the results.  As always,
anyone reading this is welcome, and encouraged, to do any additional
testing they believe might be helpful.

I plan to merge this into lsm/stable-7.0 either later tonight, or sometime
over the weekend, so the patchset has some time in linux-next.  As we're
fairly close to the v7.1 merge window, I may decide to hold this for Linus
until then; let's see how things turn out with linux-next as well as any
additional review comments.

[2] https://lore.kernel.org/linux-security-module/20260323042510.3331778-4-paul@paul-moore.com/

--
CHANGELOG:
v3:
- fix the LSM hook stubs (kernel robot, Ryan Lee)
- fix the lsm_backing_file_cache allocation size (Ryan Lee)
- minor style, simplicity tweaks to the SELinux patch
v2:
- remove the user O_PATH file patch from Amir
- add the backing_file LSM blob and lifecycle hooks
- update the SELinux code to reflect the other changes
v1:
- initial version

--
Paul Moore (2):
      lsm: add backing_file LSM hooks
      selinux: fix overlayfs mmap() and mprotect() access checks

 fs/backing-file.c                 |   18 +-
 fs/erofs/ishare.c                 |   10 +
 fs/file_table.c                   |   21 ++
 fs/fuse/passthrough.c             |    2 
 fs/internal.h                     |    3 
 fs/overlayfs/dir.c                |    2 
 fs/overlayfs/file.c               |    2 
 include/linux/backing-file.h      |    4 
 include/linux/fs.h                |    1 
 include/linux/lsm_audit.h         |    2 
 include/linux/lsm_hook_defs.h     |    5 
 include/linux/lsm_hooks.h         |    1 
 include/linux/security.h          |   22 ++
 security/lsm.h                    |    1 
 security/lsm_init.c               |    9 +
 security/security.c               |  100 +++++++++++
 security/selinux/hooks.c          |  256 +++++++++++++++++++++---------
 security/selinux/include/objsec.h |   17 +
 18 files changed, 389 insertions(+), 87 deletions(-)


^ permalink raw reply

* Re: [PATCH v3 4/9] lsm: framework for BPF integrity verification
From: Song Liu @ 2026-03-27 18:24 UTC (permalink / raw)
  To: Blaise Boscaccy
  Cc: Jonathan Corbet, Paul Moore, James Morris, Serge E. Hallyn,
	Mickaël Salaün, Günther Noack,
	Dr. David Alan Gilbert, Andrew Morton, James.Bottomley, dhowells,
	Fan Wu, Ryan Foster, Randy Dunlap, linux-security-module,
	linux-doc, linux-kernel, bpf
In-Reply-To: <871ph5f99z.fsf@microsoft.com>

On Fri, Mar 27, 2026 at 10:54 AM Blaise Boscaccy
<bboscaccy@linux.microsoft.com> wrote:
>
> Song Liu <song@kernel.org> writes:
>
> > On Wed, Mar 25, 2026 at 11:07 PM Blaise Boscaccy
> > <bboscaccy@linux.microsoft.com> wrote:
> > [...]
> >> The first new callback, bpf_prog_load_integrity(), located within the
> >> security_bpf_prog_load() hook, is necessary to ensure that the integrity
> >> verification callbacks are executed before any of the existing LSMs
> >> are executed via the bpf_prog_load() callback.  Reusing the existing
> >> bpf_prog_load() callback for integrity verification could result in LSMs
> >> not having access to the integrity verification results when asked to
> >> authorize the BPF program load in the bpf_prog_load() callback.
> >>
> >> The new LSM hook, security_bpf_prog_load_post_integrity(), is intended
> >> to be called from within LSMs performing BPF program integrity
> >> verification.  It is used to report the verdict of the integrity
> >> verification to other LSMs enforcing access control policy on BPF
> >> program loads.  LSMs enforcing such access controls should register a
> >> bpf_prog_load_post_integrity() callback to receive integrity verdicts.
> >
> > bpf_prog_load_post_integrity() is weird. Some questions about it:
> >
> > 1. Is it possible to call it from other LSMs (not hornet)? Specifically, is it
> >    possible to call it from BPF LSM?
>
> There is nothing hornet exclusive about that security hook. If the BPF
> LSM folks wanted to use it they would probably need to implement a
> kfunc to invoke it.

Please also include the kfunc in v4.

Thanks,
Song

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox