* [PATCH v2 3/4] selftests/landlock: Drain stale audit records on init
2026-04-01 16:14 [PATCH v2 0/4] Fix Landlock audit test flakiness Mickaël Salaün
2026-04-01 16:14 ` [PATCH v2 1/4] selftests/landlock: Fix snprintf truncation checks in audit helpers Mickaël Salaün
2026-04-01 16:14 ` [PATCH v2 2/4] selftests/landlock: Fix socket file descriptor leaks " Mickaël Salaün
@ 2026-04-01 16:14 ` Mickaël Salaün
2026-04-01 16:14 ` [PATCH v2 4/4] selftests/landlock: Skip stale records in audit_match_record() Mickaël Salaün
3 siblings, 0 replies; 5+ messages in thread
From: Mickaël Salaün @ 2026-04-01 16:14 UTC (permalink / raw)
To: Günther Noack
Cc: Mickaël Salaün, linux-security-module, Justin Suess,
Tingmao Wang
Non-audit Landlock tests generate audit records as side effects when
audit_enabled is non-zero (e.g. from boot configuration). These records
accumulate in the kernel audit backlog while no audit daemon socket is
open. When the next test opens a new netlink socket and registers as
the audit daemon, the stale backlog is delivered, causing baseline
record count checks to fail spuriously.
Fix this by draining all pending records in audit_init() right after
setting the receive timeout. The 1-usec SO_RCVTIMEO causes audit_recv()
to return -EAGAIN once the backlog is empty, naturally terminating the
drain loop.
Domain deallocation records are emitted asynchronously from a work
queue, so they may still arrive after the drain. Remove records.domain
== 0 checks that are not preceded by audit_match_record() calls, which
would otherwise consume stale records before the count. Document this
constraint above audit_count_records().
Increasing the drain timeout to catch in-flight deallocation records was
considered but rejected: a longer timeout adds latency to every
audit_init() call even when no stale record is pending, and any fixed
timeout is still not guaranteed to catch all records under load.
Removing the unprotected checks is simpler and avoids the spurious
failures.
Cc: Günther Noack <gnoack@google.com>
Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
Link: https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
---
Changes since v1:
https://lore.kernel.org/r/20260312100444.2609563-8-mic@digikod.net
- Also remove domain checks from audit.trace and
scoped_audit.connect_to_child.
- Document records.domain == 0 constraint above
audit_count_records().
- Explain why a longer drain timeout was rejected.
- Drop Reviewed-by (new code comment not in v1).
- Split snprintf and fd leak fixes into separate patches.
---
tools/testing/selftests/landlock/audit.h | 19 +++++++++++++++++++
tools/testing/selftests/landlock/audit_test.c | 2 --
.../testing/selftests/landlock/ptrace_test.c | 1 -
.../landlock/scoped_abstract_unix_test.c | 1 -
4 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
index 6422943fc69e..74e1c3d763be 100644
--- a/tools/testing/selftests/landlock/audit.h
+++ b/tools/testing/selftests/landlock/audit.h
@@ -338,6 +338,15 @@ struct audit_records {
size_t domain;
};
+/*
+ * WARNING: Do not assert records.domain == 0 without a preceding
+ * audit_match_record() call. Domain deallocation records are emitted
+ * asynchronously from kworker threads and can arrive after the drain in
+ * audit_init(), corrupting the domain count. A preceding audit_match_record()
+ * call consumes stale records while scanning, making the assertion safe in
+ * practice because stale deallocation records arrive before the expected access
+ * records.
+ */
static int audit_count_records(int audit_fd, struct audit_records *records)
{
struct audit_message msg;
@@ -393,6 +402,16 @@ static int audit_init(void)
goto err_close;
}
+ /*
+ * Drains stale audit records that accumulated in the kernel backlog
+ * while no audit daemon socket was open. This happens when non-audit
+ * Landlock tests generate records while audit_enabled is non-zero (e.g.
+ * from boot configuration), or when domain deallocation records arrive
+ * asynchronously after a previous test's socket was closed.
+ */
+ while (audit_recv(fd, NULL) == 0)
+ ;
+
return fd;
err_close:
diff --git a/tools/testing/selftests/landlock/audit_test.c b/tools/testing/selftests/landlock/audit_test.c
index 46d02d49835a..f92ba6774faa 100644
--- a/tools/testing/selftests/landlock/audit_test.c
+++ b/tools/testing/selftests/landlock/audit_test.c
@@ -412,7 +412,6 @@ TEST_F(audit_flags, signal)
} else {
EXPECT_EQ(1, records.access);
}
- EXPECT_EQ(0, records.domain);
/* Updates filter rules to match the drop record. */
set_cap(_metadata, CAP_AUDIT_CONTROL);
@@ -601,7 +600,6 @@ TEST_F(audit_exec, signal_and_open)
/* Tests that there was no denial until now. */
EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
EXPECT_EQ(0, records.access);
- EXPECT_EQ(0, records.domain);
/*
* Wait for the child to do a first denied action by layer1 and
diff --git a/tools/testing/selftests/landlock/ptrace_test.c b/tools/testing/selftests/landlock/ptrace_test.c
index 4f64c90583cd..1b6c8b53bf33 100644
--- a/tools/testing/selftests/landlock/ptrace_test.c
+++ b/tools/testing/selftests/landlock/ptrace_test.c
@@ -342,7 +342,6 @@ TEST_F(audit, trace)
/* Makes sure there is no superfluous logged records. */
EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
EXPECT_EQ(0, records.access);
- EXPECT_EQ(0, records.domain);
yama_ptrace_scope = get_yama_ptrace_scope();
ASSERT_LE(0, yama_ptrace_scope);
diff --git a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
index 72f97648d4a7..c47491d2d1c1 100644
--- a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
+++ b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c
@@ -312,7 +312,6 @@ TEST_F(scoped_audit, connect_to_child)
/* Makes sure there is no superfluous logged records. */
EXPECT_EQ(0, audit_count_records(self->audit_fd, &records));
EXPECT_EQ(0, records.access);
- EXPECT_EQ(0, records.domain);
ASSERT_EQ(0, pipe2(pipe_child, O_CLOEXEC));
ASSERT_EQ(0, pipe2(pipe_parent, O_CLOEXEC));
--
2.53.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* [PATCH v2 4/4] selftests/landlock: Skip stale records in audit_match_record()
2026-04-01 16:14 [PATCH v2 0/4] Fix Landlock audit test flakiness Mickaël Salaün
` (2 preceding siblings ...)
2026-04-01 16:14 ` [PATCH v2 3/4] selftests/landlock: Drain stale audit records on init Mickaël Salaün
@ 2026-04-01 16:14 ` Mickaël Salaün
3 siblings, 0 replies; 5+ messages in thread
From: Mickaël Salaün @ 2026-04-01 16:14 UTC (permalink / raw)
To: Günther Noack
Cc: Mickaël Salaün, linux-security-module, Justin Suess,
Tingmao Wang
Domain deallocation records are emitted asynchronously from kworker
threads (via free_ruleset_work()). Stale deallocation records from a
previous test can arrive during the current test's deallocation read
loop and be picked up by audit_match_record() instead of the expected
record, causing a domain ID mismatch. The audit.layers test (which
creates 16 nested domains) is particularly vulnerable because it reads
16 deallocation records in sequence, providing a large window for stale
records to interleave.
The same issue affects audit_flags.signal, where deallocation records
from a previous test (audit.layers) can leak into the next test and be
picked up by audit_match_record() instead of the expected record.
Fix this by continuing to read records when the type matches but the
content pattern does not. Stale records are silently consumed, and the
loop only stops when both type and pattern match (or the socket times
out with -EAGAIN).
Additionally, extend matches_log_domain_deallocated() with an
expected_domain_id parameter. When set, the regex pattern includes the
specific domain ID as a literal hex value, so that deallocation records
for a different domain do not match the pattern at all. This handles
the case where the stale record has the same denial count as the
expected one (e.g. both have denials=1), which the type+pattern loop
alone cannot distinguish. Callers that already know the expected domain
ID (from a prior denial or allocation record) now pass it to filter
precisely.
When expected_domain_id is set, matches_log_domain_deallocated() also
temporarily increases the socket timeout to audit_tv_dom_drop (1 second)
to wait for the asynchronous kworker deallocation, and restores
audit_tv_default afterward. This removes the need for callers to manage
the timeout switch manually.
Cc: Günther Noack <gnoack@google.com>
Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
---
Changes since v1:
- New patch.
---
tools/testing/selftests/landlock/audit.h | 81 ++++++++++++++-----
tools/testing/selftests/landlock/audit_test.c | 32 ++++----
2 files changed, 75 insertions(+), 38 deletions(-)
diff --git a/tools/testing/selftests/landlock/audit.h b/tools/testing/selftests/landlock/audit.h
index 74e1c3d763be..b439a9b00f0e 100644
--- a/tools/testing/selftests/landlock/audit.h
+++ b/tools/testing/selftests/landlock/audit.h
@@ -249,9 +249,9 @@ static __maybe_unused char *regex_escape(const char *const src, char *dst,
static int audit_match_record(int audit_fd, const __u16 type,
const char *const pattern, __u64 *domain_id)
{
- struct audit_message msg;
+ struct audit_message msg, last_mismatch = {};
int ret, err = 0;
- bool matches_record = !type;
+ int num_type_match = 0;
regmatch_t matches[2];
regex_t regex;
@@ -259,21 +259,35 @@ static int audit_match_record(int audit_fd, const __u16 type,
if (ret)
return -EINVAL;
- do {
+ /*
+ * Reads records until one matches both the expected type and the
+ * pattern. Type-matching records with non-matching content are
+ * silently consumed, which handles stale domain deallocation records
+ * from a previous test emitted asynchronously by kworker threads.
+ */
+ while (true) {
memset(&msg, 0, sizeof(msg));
err = audit_recv(audit_fd, &msg);
- if (err)
+ if (err) {
+ if (num_type_match) {
+ printf("DATA: %s\n", last_mismatch.data);
+ printf("ERROR: %d record(s) matched type %u"
+ " but not pattern: %s\n",
+ num_type_match, type, pattern);
+ }
goto out;
+ }
- if (msg.header.nlmsg_type == type)
- matches_record = true;
- } while (!matches_record);
+ if (type && msg.header.nlmsg_type != type)
+ continue;
- ret = regexec(®ex, msg.data, ARRAY_SIZE(matches), matches, 0);
- if (ret) {
- printf("DATA: %s\n", msg.data);
- printf("ERROR: no match for pattern: %s\n", pattern);
- err = -ENOENT;
+ ret = regexec(®ex, msg.data, ARRAY_SIZE(matches), matches,
+ 0);
+ if (!ret)
+ break;
+
+ num_type_match++;
+ last_mismatch = msg;
}
if (domain_id) {
@@ -316,21 +330,48 @@ static int __maybe_unused matches_log_domain_allocated(int audit_fd, pid_t pid,
domain_id);
}
-static int __maybe_unused matches_log_domain_deallocated(
- int audit_fd, unsigned int num_denials, __u64 *domain_id)
+/*
+ * Matches a domain deallocation record. When expected_domain_id is non-zero,
+ * the pattern includes the specific domain ID so that stale deallocation
+ * records from a previous test (with a different domain ID) are skipped by
+ * audit_match_record(), and the socket timeout is temporarily increased to
+ * audit_tv_dom_drop to wait for the asynchronous kworker deallocation.
+ */
+static int __maybe_unused
+matches_log_domain_deallocated(int audit_fd, unsigned int num_denials,
+ __u64 expected_domain_id, __u64 *domain_id)
{
static const char log_template[] = REGEX_LANDLOCK_PREFIX
" status=deallocated denials=%u$";
- char log_match[sizeof(log_template) + 10];
- int log_match_len;
+ static const char log_template_with_id[] =
+ "^audit([0-9.:]\\+): domain=\\(%llx\\)"
+ " status=deallocated denials=%u$";
+ char log_match[sizeof(log_template_with_id) + 32];
+ int log_match_len, err;
+
+ if (expected_domain_id)
+ log_match_len = snprintf(log_match, sizeof(log_match),
+ log_template_with_id,
+ expected_domain_id, num_denials);
+ else
+ log_match_len = snprintf(log_match, sizeof(log_match),
+ log_template, num_denials);
- log_match_len = snprintf(log_match, sizeof(log_match), log_template,
- num_denials);
if (log_match_len >= sizeof(log_match))
return -E2BIG;
- return audit_match_record(audit_fd, AUDIT_LANDLOCK_DOMAIN, log_match,
- domain_id);
+ if (expected_domain_id)
+ setsockopt(audit_fd, SOL_SOCKET, SO_RCVTIMEO,
+ &audit_tv_dom_drop, sizeof(audit_tv_dom_drop));
+
+ err = audit_match_record(audit_fd, AUDIT_LANDLOCK_DOMAIN, log_match,
+ domain_id);
+
+ if (expected_domain_id)
+ setsockopt(audit_fd, SOL_SOCKET, SO_RCVTIMEO, &audit_tv_default,
+ sizeof(audit_tv_default));
+
+ return err;
}
struct audit_records {
diff --git a/tools/testing/selftests/landlock/audit_test.c b/tools/testing/selftests/landlock/audit_test.c
index f92ba6774faa..a76c3686a0fe 100644
--- a/tools/testing/selftests/landlock/audit_test.c
+++ b/tools/testing/selftests/landlock/audit_test.c
@@ -139,13 +139,16 @@ TEST_F(audit, layers)
WEXITSTATUS(status) != EXIT_SUCCESS)
_metadata->exit_code = KSFT_FAIL;
- /* Purges log from deallocated domains. */
- EXPECT_EQ(0, setsockopt(self->audit_fd, SOL_SOCKET, SO_RCVTIMEO,
- &audit_tv_dom_drop, sizeof(audit_tv_dom_drop)));
+ /*
+ * Purges log from deallocated domains. Records arrive in LIFO order
+ * (innermost domain first) because landlock_put_hierarchy() walks the
+ * chain sequentially in a single kworker context.
+ */
for (i = ARRAY_SIZE(*domain_stack) - 1; i >= 0; i--) {
__u64 deallocated_dom = 2;
EXPECT_EQ(0, matches_log_domain_deallocated(self->audit_fd, 1,
+ (*domain_stack)[i],
&deallocated_dom));
EXPECT_EQ((*domain_stack)[i], deallocated_dom)
{
@@ -154,8 +157,6 @@ TEST_F(audit, layers)
}
}
EXPECT_EQ(0, munmap(domain_stack, sizeof(*domain_stack)));
- EXPECT_EQ(0, setsockopt(self->audit_fd, SOL_SOCKET, SO_RCVTIMEO,
- &audit_tv_default, sizeof(audit_tv_default)));
EXPECT_EQ(0, close(ruleset_fd));
}
@@ -270,13 +271,9 @@ TEST_F(audit, thread)
EXPECT_EQ(0, close(pipe_parent[1]));
ASSERT_EQ(0, pthread_join(thread, NULL));
- EXPECT_EQ(0, setsockopt(self->audit_fd, SOL_SOCKET, SO_RCVTIMEO,
- &audit_tv_dom_drop, sizeof(audit_tv_dom_drop)));
- EXPECT_EQ(0, matches_log_domain_deallocated(self->audit_fd, 1,
- &deallocated_dom));
+ EXPECT_EQ(0, matches_log_domain_deallocated(
+ self->audit_fd, 1, denial_dom, &deallocated_dom));
EXPECT_EQ(denial_dom, deallocated_dom);
- EXPECT_EQ(0, setsockopt(self->audit_fd, SOL_SOCKET, SO_RCVTIMEO,
- &audit_tv_default, sizeof(audit_tv_default)));
}
FIXTURE(audit_flags)
@@ -432,22 +429,21 @@ TEST_F(audit_flags, signal)
if (variant->restrict_flags &
LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF) {
+ /*
+ * No deallocation record: denials=0 never matches a real
+ * record.
+ */
EXPECT_EQ(-EAGAIN,
- matches_log_domain_deallocated(self->audit_fd, 0,
+ matches_log_domain_deallocated(self->audit_fd, 0, 0,
&deallocated_dom));
EXPECT_EQ(deallocated_dom, 2);
} else {
- EXPECT_EQ(0, setsockopt(self->audit_fd, SOL_SOCKET, SO_RCVTIMEO,
- &audit_tv_dom_drop,
- sizeof(audit_tv_dom_drop)));
EXPECT_EQ(0, matches_log_domain_deallocated(self->audit_fd, 2,
+ *self->domain_id,
&deallocated_dom));
EXPECT_NE(deallocated_dom, 2);
EXPECT_NE(deallocated_dom, 0);
EXPECT_EQ(deallocated_dom, *self->domain_id);
- EXPECT_EQ(0, setsockopt(self->audit_fd, SOL_SOCKET, SO_RCVTIMEO,
- &audit_tv_default,
- sizeof(audit_tv_default)));
}
}
--
2.53.0
^ permalink raw reply related [flat|nested] 5+ messages in thread