* [PATCH 1/4] selftests/namespace: fix selftest hang-up caused by zombie processes
2026-04-05 16:50 [PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period Yohei Kojima
@ 2026-04-05 16:50 ` Yohei Kojima
2026-04-05 16:50 ` [PATCH 2/4] selftests/namespace: fix unintentional skip in ns_active_ref_test.c Yohei Kojima
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Yohei Kojima @ 2026-04-05 16:50 UTC (permalink / raw)
To: Christian Brauner, Shuah Khan; +Cc: Yohei Kojima, linux-kselftest, linux-kernel
Fix zombie grandchild processes spawned by the timens_separate and
pidns_separate tests in nsid_test.c. This also prevents kselftest from
hanging after running these tests.
Signed-off-by: Yohei Kojima <yk@y-koj.net>
---
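For illustration, here is a minimal userspace sketch of the cleanup
pattern described above. It is not the selftest code: the helper names
(spawn_parked_grandchild, kill_grandchild, check_grandchild_cleanup) are
hypothetical, and it assumes the grandchild simply parks in pause() and
that its pid reaches the test process through a pipe.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that forks a grandchild; the grandchild parks in pause().
 * Returns the grandchild's pid, or -1 on failure.
 */
static pid_t spawn_parked_grandchild(void)
{
	int pipefd[2];
	pid_t child, grandchild = -1;

	if (pipe(pipefd) < 0)
		return -1;

	child = fork();
	if (child < 0) {
		close(pipefd[0]);
		close(pipefd[1]);
		return -1;
	}

	if (child == 0) {
		close(pipefd[0]);
		grandchild = fork();
		if (grandchild < 0)
			_exit(1);
		if (grandchild == 0) {
			close(pipefd[1]);
			for (;;)
				pause();	/* stays alive until killed */
		}
		/* Report the grandchild's pid, then exit without waiting for it */
		if (write(pipefd[1], &grandchild, sizeof(grandchild)) !=
		    (ssize_t)sizeof(grandchild)) {
			kill(grandchild, SIGKILL);
			_exit(1);
		}
		_exit(0);
	}

	close(pipefd[1]);
	if (read(pipefd[0], &grandchild, sizeof(grandchild)) !=
	    (ssize_t)sizeof(grandchild))
		grandchild = -1;
	close(pipefd[0]);
	waitpid(child, NULL, 0);	/* the intermediate child is ours to reap */
	return grandchild;
}

/* Teardown-style cleanup: kill the recorded grandchild if there is one. */
static int kill_grandchild(pid_t grandchild)
{
	if (grandchild <= 0)
		return 0;
	if (kill(grandchild, SIGKILL) < 0)
		return -1;
	/*
	 * waitpid() only reaps children of the caller; for a grandchild
	 * that was reparented elsewhere it fails with ECHILD, which is
	 * deliberately ignored here.
	 */
	waitpid(grandchild, NULL, 0);
	return 0;
}

/* Usage: spawn, confirm the grandchild exists, then clean it up. */
static void check_grandchild_cleanup(void)
{
	pid_t g = spawn_parked_grandchild();

	assert(g > 0);
	assert(kill(g, 0) == 0);
	assert(kill_grandchild(g) == 0);
}
```

The point of recording the pid is that the cleanup can run even when an
assertion aborts the test body early, which is presumably why the patch
stores it in the fixture.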
tools/testing/selftests/namespaces/nsid_test.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/tools/testing/selftests/namespaces/nsid_test.c b/tools/testing/selftests/namespaces/nsid_test.c
index b4a14c6693a5..1e2669372b7b 100644
--- a/tools/testing/selftests/namespaces/nsid_test.c
+++ b/tools/testing/selftests/namespaces/nsid_test.c
@@ -25,10 +25,12 @@
/* Fixture for tests that create child processes */
FIXTURE(nsid) {
pid_t child_pid;
+ pid_t grandchild_pid;
};
FIXTURE_SETUP(nsid) {
self->child_pid = 0;
+ self->grandchild_pid = 0;
}
FIXTURE_TEARDOWN(nsid) {
@@ -37,6 +39,10 @@ FIXTURE_TEARDOWN(nsid) {
kill(self->child_pid, SIGKILL);
waitpid(self->child_pid, NULL, 0);
}
+ if (self->grandchild_pid > 0) {
+ kill(self->grandchild_pid, SIGKILL);
+ waitpid(self->grandchild_pid, NULL, 0);
+ }
}
TEST(nsid_mntns_basic)
@@ -677,6 +683,7 @@ TEST_F(nsid, timens_separate)
pid_t grandchild_pid;
ASSERT_EQ(read(pipefd[0], &grandchild_pid, sizeof(grandchild_pid)), sizeof(grandchild_pid));
close(pipefd[0]);
+ self->grandchild_pid = grandchild_pid;
/* Open grandchild's time namespace */
char path[256];
@@ -798,6 +805,7 @@ TEST_F(nsid, pidns_separate)
pid_t grandchild_pid;
ASSERT_EQ(read(pipefd[0], &grandchild_pid, sizeof(grandchild_pid)), sizeof(grandchild_pid));
close(pipefd[0]);
+ self->grandchild_pid = grandchild_pid;
/* Open grandchild's PID namespace */
char path[256];
--
2.52.0
* [PATCH 2/4] selftests/namespace: fix unintentional skip in ns_active_ref_test.c
2026-04-05 16:50 [PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period Yohei Kojima
2026-04-05 16:50 ` [PATCH 1/4] selftests/namespace: fix selftest hang-up caused by zombie processes Yohei Kojima
@ 2026-04-05 16:50 ` Yohei Kojima
2026-04-05 16:50 ` [PATCH 3/4] nstree: Fix spurious ENOENT in listns pagination during grace period Yohei Kojima
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Yohei Kojima @ 2026-04-05 16:50 UTC (permalink / raw)
To: Christian Brauner, Shuah Khan; +Cc: Yohei Kojima, linux-kselftest, linux-kernel
Fix ESTALE from open_by_handle_at() in ns_multiple_children_same_parent
when child processes exit before the parent runs it.
Signed-off-by: Yohei Kojima <yk@y-koj.net>
---
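For comparison only: the patch below widens the window with usleep(). A
different technique, not used by this patch, is an explicit handshake in
which the child blocks until the parent releases it. The sketch below
assumes a second pipe for the release signal; the function name
child_alive_during_parent_work is hypothetical, and the parent's real
work (open_by_handle_at()) is replaced by a liveness check.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the child was still running while the parent did its
 * work, 0 if it had already exited, -1 on setup failure.
 */
static int child_alive_during_parent_work(void)
{
	int to_parent[2], to_child[2];
	pid_t pid;
	char c = 0;
	int alive;

	if (pipe(to_parent) < 0)
		return -1;
	if (pipe(to_child) < 0) {
		close(to_parent[0]);
		close(to_parent[1]);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(to_parent[0]);
		close(to_parent[1]);
		close(to_child[0]);
		close(to_child[1]);
		return -1;
	}

	if (pid == 0) {
		close(to_parent[0]);
		close(to_child[1]);
		/* Publish data (stand-in for the namespace ids) */
		if (write(to_parent[1], &c, 1) != 1)
			_exit(1);
		close(to_parent[1]);
		/* Block until the parent writes or closes its end */
		if (read(to_child[0], &c, 1) < 0)
			_exit(1);
		_exit(0);
	}

	close(to_parent[1]);
	close(to_child[0]);

	if (read(to_parent[0], &c, 1) != 1) {
		alive = -1;
	} else {
		/* The parent's real work would go here */
		alive = (waitpid(pid, NULL, WNOHANG) == 0);
	}

	close(to_parent[0]);
	close(to_child[1]);		/* release the child */
	waitpid(pid, NULL, 0);
	return alive;
}

/* Usage: the child must still be alive at the point of the check. */
static void check_handshake(void)
{
	assert(child_alive_during_parent_work() == 1);
}
```

The trade-off is one extra pipe per test in exchange for not depending
on scheduling latency.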
tools/testing/selftests/namespaces/ns_active_ref_test.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/namespaces/ns_active_ref_test.c b/tools/testing/selftests/namespaces/ns_active_ref_test.c
index 093268f0efaa..29d96a6e8100 100644
--- a/tools/testing/selftests/namespaces/ns_active_ref_test.c
+++ b/tools/testing/selftests/namespaces/ns_active_ref_test.c
@@ -1193,6 +1193,10 @@ TEST(ns_multiple_children_same_parent)
write(pipefd[1], &c1_id, sizeof(c1_id));
write(pipefd[1], &c2_id, sizeof(c2_id));
close(pipefd[1]);
+
+ /* give parent a time to run open_by_handle_at() */
+ usleep(10000);
+
exit(0);
}
--
2.52.0
* [PATCH 3/4] nstree: Fix spurious ENOENT in listns pagination during grace period
2026-04-05 16:50 [PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period Yohei Kojima
2026-04-05 16:50 ` [PATCH 1/4] selftests/namespace: fix selftest hang-up caused by zombie processes Yohei Kojima
2026-04-05 16:50 ` [PATCH 2/4] selftests/namespace: fix unintentional skip in ns_active_ref_test.c Yohei Kojima
@ 2026-04-05 16:50 ` Yohei Kojima
2026-04-05 16:50 ` [PATCH 4/4] selftests/namespace: test spurious ENOENT bug in listns pagination Yohei Kojima
2026-04-07 12:57 ` [PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period Yohei Kojima
4 siblings, 0 replies; 7+ messages in thread
From: Yohei Kojima @ 2026-04-05 16:50 UTC (permalink / raw)
To: Christian Brauner; +Cc: Yohei Kojima, linux-kernel
Fix false ENOENT returned from listns when (1) pagination is used
(req.ns_id != 0) and (2) listns tries to start enumeration from a
destroyed or inactive namespace.
The cause is that lookup_ns_id_at(kls->last_ns_id + 1, ...) returns
NULL if the first namespace after ns_id has been destroyed or
inactivated, as shown below. (Note that nstree can be treated as a list
here, since it is an rbtree sorted by ns id.)
A: active namespace
D: destroyed (or inactive) namespace
       +-----+-----+-----+-----+-----+-----+-----+-----+
state: |  A  |  A  |  A  |  D  |  D  |  A  |  A  |  A  |
ns_id: |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |
       +-----+-----+-----+-----+-----+-----+-----+-----+
                      |     |
                      |     +-- (kls->last_ns_id + 1)
                      +-- req.ns_id = 3
In this case, lookup_ns_id_at() returns NULL, which results in -ENOENT
being returned from do_listns() although three namespaces remain in the
nstree.
The bug is fixed by iterating over the nstree's internal list until it
reaches the first active namespace.
Fixes: 76b6f5dfb3fd ("nstree: add listns()")
Signed-off-by: Yohei Kojima <yk@y-koj.net>
---
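As a reading aid, the lookup change can be modelled in userspace with a
sorted array in place of the rbtree and its list. This is only a sketch
under that assumption: struct ns_entry, lower_bound, lookup_old and
lookup_new are hypothetical names, and reference counting and RCU are
left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of one nstree entry. */
struct ns_entry {
	uint64_t id;
	bool active;
};

/* The layout from the diagram: ids 4 and 5 are destroyed or inactive. */
static const struct ns_entry diagram[] = {
	{ 1, true },  { 2, true }, { 3, true }, { 4, false },
	{ 5, false }, { 6, true }, { 7, true }, { 8, true },
};
#define DIAGRAM_LEN (sizeof(diagram) / sizeof(diagram[0]))

/* Index of the first entry with id >= ns_id, or n if there is none. */
static size_t lower_bound(const struct ns_entry *v, size_t n, uint64_t ns_id)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (v[mid].id < ns_id)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Old behaviour: give up if the lower-bound entry is not active. */
static const struct ns_entry *lookup_old(const struct ns_entry *v, size_t n,
					 uint64_t ns_id)
{
	size_t i = lower_bound(v, n, ns_id);

	if (i == n || !v[i].active)
		return NULL;
	return &v[i];
}

/* New behaviour: walk forward to the first active entry, if any. */
static const struct ns_entry *lookup_new(const struct ns_entry *v, size_t n,
					 uint64_t ns_id)
{
	size_t i;

	for (i = lower_bound(v, n, ns_id); i < n; i++)
		if (v[i].active)
			return &v[i];
	return NULL;
}

/* Usage: pagination resumes at req.ns_id + 1 = 4. */
static void check_diagram(void)
{
	assert(lookup_old(diagram, DIAGRAM_LEN, 4) == NULL);
	assert(lookup_new(diagram, DIAGRAM_LEN, 4)->id == 6);
}
```

With the diagram's data, the old lookup stops at id 4 and reports
nothing, while the new one skips ids 4 and 5 and lands on id 6.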
kernel/nstree.c | 68 +++++++++++++++++++++++++++++++++----------------
1 file changed, 46 insertions(+), 22 deletions(-)
diff --git a/kernel/nstree.c b/kernel/nstree.c
index 6d12e5900ac0..476d22203ee0 100644
--- a/kernel/nstree.c
+++ b/kernel/nstree.c
@@ -618,14 +618,32 @@ static ssize_t do_listns_userns(struct klistns *kls)
return ret;
}
+static inline struct ns_common *next_ns_common(struct ns_common *ns,
+ struct ns_tree_root *ns_tree)
+{
+ if (ns_tree)
+ return list_entry_rcu(ns->ns_tree_node.ns_list_entry.next, struct ns_common, ns_tree_node.ns_list_entry);
+ return list_entry_rcu(ns->ns_unified_node.ns_list_entry.next, struct ns_common, ns_unified_node.ns_list_entry);
+}
+
+static inline bool ns_common_is_head(struct ns_common *ns,
+ const struct list_head *head,
+ struct ns_tree_root *ns_tree)
+{
+ if (ns_tree)
+ return &ns->ns_tree_node.ns_list_entry == head;
+ return &ns->ns_unified_node.ns_list_entry == head;
+}
+
/*
* Lookup a namespace with id >= ns_id in either the unified tree or a type-specific tree.
* Returns the namespace with the smallest id that is >= ns_id.
*/
static struct ns_common *lookup_ns_id_at(u64 ns_id, int ns_type)
{
- struct ns_common *ret = NULL;
+ struct ns_common *min = NULL, *ret = NULL;
struct ns_tree_root *ns_tree = NULL;
+ struct list_head *head;
struct rb_node *node;
if (ns_type) {
@@ -651,9 +669,9 @@ static struct ns_common *lookup_ns_id_at(u64 ns_id, int ns_type)
if (ns_id <= ns->ns_id) {
if (ns_type)
- ret = node_to_ns(node);
+ min = node_to_ns(node);
else
- ret = node_to_ns_unified(node);
+ min = node_to_ns_unified(node);
if (ns_id == ns->ns_id)
break;
node = node->rb_left;
@@ -662,8 +680,31 @@ static struct ns_common *lookup_ns_id_at(u64 ns_id, int ns_type)
}
}
- if (ret)
- ret = ns_get_unless_inactive(ret);
+ if (!min)
+ return NULL;
+ /*
+ * Now min->ns_id is the minimum id where min->ns_id >= ns_id holds,
+ * but min could be inactive or destroyed here, therefore
+ * ns_get_unless_inactive(min) could return NULL.
+ *
+ * To handle this case, try acquiring the next ns until it reaches the
+ * first valid ns.
+ */
+ if (ns_tree)
+ head = &ns_tree->ns_list_head;
+ else
+ head = &ns_unified_root.ns_list_head;
+
+ while (!ns_common_is_head(min, head, ns_tree)) {
+ ret = ns_get_unless_inactive(min);
+ if (ret)
+ break;
+
+ rcu_read_lock();
+ min = next_ns_common(min, ns_tree);
+ rcu_read_unlock();
+ }
+
return ret;
}
@@ -675,23 +716,6 @@ static inline struct ns_common *first_ns_common(const struct list_head *head,
return list_entry_rcu(head->next, struct ns_common, ns_unified_node.ns_list_entry);
}
-static inline struct ns_common *next_ns_common(struct ns_common *ns,
- struct ns_tree_root *ns_tree)
-{
- if (ns_tree)
- return list_entry_rcu(ns->ns_tree_node.ns_list_entry.next, struct ns_common, ns_tree_node.ns_list_entry);
- return list_entry_rcu(ns->ns_unified_node.ns_list_entry.next, struct ns_common, ns_unified_node.ns_list_entry);
-}
-
-static inline bool ns_common_is_head(struct ns_common *ns,
- const struct list_head *head,
- struct ns_tree_root *ns_tree)
-{
- if (ns_tree)
- return &ns->ns_tree_node.ns_list_entry == head;
- return &ns->ns_unified_node.ns_list_entry == head;
-}
-
static ssize_t do_listns(struct klistns *kls)
{
u64 __user *ns_ids = kls->uns_ids;
--
2.52.0
* [PATCH 4/4] selftests/namespace: test spurious ENOENT bug in listns pagination
2026-04-05 16:50 [PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period Yohei Kojima
` (2 preceding siblings ...)
2026-04-05 16:50 ` [PATCH 3/4] nstree: Fix spurious ENOENT in listns pagination during grace period Yohei Kojima
@ 2026-04-05 16:50 ` Yohei Kojima
2026-04-07 12:57 ` [PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period Yohei Kojima
4 siblings, 0 replies; 7+ messages in thread
From: Yohei Kojima @ 2026-04-05 16:50 UTC (permalink / raw)
To: Christian Brauner, Shuah Khan; +Cc: Yohei Kojima, linux-kselftest, linux-kernel
Add a test for the spurious ENOENT that occurs when listns tries to
start pagination from an inactive or destroyed namespace. The new test
is almost identical to pagination_with_type_filter, except that it calls
run_noisy_children(), which creates and lists namespaces to disturb
nstree.
As far as the author has tested, this bug reproduces only on bare metal,
probably because the test relies on RCU behavior and the kernel behaves
differently in a VM.
Signed-off-by: Yohei Kojima <yk@y-koj.net>
---
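To make the pagination contract concrete, the sketch below walks a mock
namespace list page by page, feeding the last id of each page back in as
the cursor, the way the test sets req.ns_id = first_batch[2]. It does
not call listns(): mock_listns and paginate_all are hypothetical, and
the "ids strictly greater than the cursor" semantics is an assumption
inferred from lookup_ns_id_at(kls->last_ns_id + 1, ...) in patch 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical stand-in for the namespace list; ids 4 and 5 are inactive. */
static const uint64_t mock_ids[] = { 1, 2, 3, 4, 5, 6, 7, 8 };
static const bool mock_active[] = {
	true, true, true, false, false, true, true, true,
};
#define MOCK_LEN (sizeof(mock_ids) / sizeof(mock_ids[0]))

/* Copy up to cap active ids strictly greater than after_id. */
static ssize_t mock_listns(uint64_t after_id, uint64_t *out, size_t cap)
{
	size_t i, n = 0;

	for (i = 0; i < MOCK_LEN && n < cap; i++)
		if (mock_ids[i] > after_id && mock_active[i])
			out[n++] = mock_ids[i];
	return (ssize_t)n;
}

/* Collect every id in pages of page_size; returns the total, or -1. */
static ssize_t paginate_all(uint64_t *all, size_t all_cap, size_t page_size)
{
	uint64_t cursor = 0;	/* 0 means "start from the beginning" */
	size_t total = 0;

	while (total < all_cap && page_size > 0) {
		size_t room = all_cap - total;
		size_t want = page_size < room ? page_size : room;
		ssize_t got = mock_listns(cursor, all + total, want);

		if (got < 0)
			return -1;	/* the spurious ENOENT would surface here */
		if (got == 0)
			break;		/* no entries left */
		total += (size_t)got;
		cursor = all[total - 1];	/* continue after the last id seen */
	}
	return (ssize_t)total;
}

/* Usage: two pages of three ids each, skipping the inactive entries. */
static void check_pagination(void)
{
	uint64_t all[16];

	assert(paginate_all(all, 16, 3) == 6);
	assert(all[2] == 3 && all[3] == 6);
}
```

In this model the second page starts at id 6 even though the cursor
points just before two inactive entries, which is the behaviour the new
test expects from a fixed kernel.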
| 200 ++++++++++++++++++
1 file changed, 200 insertions(+)
diff --git a/tools/testing/selftests/namespaces/listns_pagination_bug.c b/tools/testing/selftests/namespaces/listns_pagination_bug.c
index da7d33f96397..f71d8f4d64bb 100644
--- a/tools/testing/selftests/namespaces/listns_pagination_bug.c
+++ b/tools/testing/selftests/namespaces/listns_pagination_bug.c
@@ -135,4 +135,204 @@ TEST(pagination_with_type_filter)
}
}
+static void run_noisy_children(int num_workers)
+{
+ struct ns_id_req req = {
+ .size = sizeof(req),
+ .spare = 0,
+ .ns_id = 0,
+ .ns_type = CLONE_NEWUSER, /* Filter by user namespace */
+ .spare2 = 0,
+ .user_ns_id = 0,
+ };
+ pid_t pids[num_workers];
+ int num_forked = 0;
+ int i;
+
+ /*
+ * Create worker processes that do concurrent operations;
+ * most of this part is borrowed from concurrent_namespace_operations
+ * test in stress_test.c
+ */
+ for (i = 0; i < num_workers; i++) {
+ pids[i] = fork();
+ if (pids[i] < 0)
+ goto failure;
+ if (pids[i] > 0)
+ num_forked++;
+
+ if (pids[i] == 0) {
+ /* Each worker: create namespaces, list them, repeat */
+ int iterations;
+
+ for (iterations = 0; iterations < 10; iterations++) {
+ int userns_fd;
+ __u64 temp_ns_ids[100];
+ ssize_t ret;
+
+ /* Create a user namespace */
+ userns_fd = get_userns_fd(0, getuid(), 1);
+ if (userns_fd < 0)
+ continue;
+
+ /* List namespaces */
+ ret = sys_listns(&req, temp_ns_ids, ARRAY_SIZE(temp_ns_ids), 0);
+ (void)ret;
+
+ close(userns_fd);
+
+ /* Small delay */
+ usleep(1000);
+ }
+
+ exit(0);
+ }
+ }
+
+ /*
+ * Return after waiting for children; this is enough for
+ * reproduction, and help keeping the test code simple.
+ */
+ for (i = 0; i < num_forked; i++)
+ waitpid(pids[i], NULL, 0);
+
+ return;
+
+failure:
+ for (i = 0; i < num_forked; i++)
+ kill(pids[i], SIGKILL);
+ for (i = 0; i < num_forked; i++)
+ waitpid(pids[i], NULL, 0);
+}
+
+/*
+ * A test case to reproduce spurious ENOENT in listns pagination
+ *
+ * The bug occurs when the ns id to start pagination is inactivated or
+ * destroyed before listns is called (or during listns is processed).
+ *
+ * This test is almost identical to pagination_with_type_filter test
+ * except that this calls run_noisy_children().
+ */
+TEST(pagination_during_grace_period)
+{
+ struct ns_id_req req = {
+ .size = sizeof(req),
+ .spare = 0,
+ .ns_id = 0,
+ .ns_type = CLONE_NEWUSER, /* Filter by user namespace */
+ .spare2 = 0,
+ .user_ns_id = 0,
+ };
+ pid_t pids[10];
+ int num_children = 10;
+ const int num_noisy_children = 10;
+ int i;
+ int sv[2];
+ __u64 first_batch[3];
+ ssize_t ret;
+
+ ASSERT_EQ(socketpair(AF_UNIX, SOCK_STREAM, 0, sv), 0);
+
+ run_noisy_children(num_noisy_children);
+
+ /* Create children with user namespaces */
+ for (i = 0; i < num_children; i++) {
+ pids[i] = fork();
+ ASSERT_GE(pids[i], 0);
+
+ if (pids[i] == 0) {
+ char c;
+
+ close(sv[0]);
+
+ if (setup_userns() < 0) {
+ close(sv[1]);
+ exit(1);
+ }
+
+ /* Signal parent we're ready */
+ if (write(sv[1], &c, 1) != 1) {
+ close(sv[1]);
+ exit(1);
+ }
+
+ /* Wait for parent signal to exit */
+ if (read(sv[1], &c, 1) != 1) {
+ close(sv[1]);
+ exit(1);
+ }
+
+ close(sv[1]);
+ exit(0);
+ }
+ }
+
+ close(sv[1]);
+
+ /* Wait for all children to signal ready */
+ for (i = 0; i < num_children; i++) {
+ char c;
+
+ if (read(sv[0], &c, 1) != 1) {
+ close(sv[0]);
+ for (int j = 0; j < num_children; j++)
+ kill(pids[j], SIGKILL);
+ for (int j = 0; j < num_children; j++)
+ waitpid(pids[j], NULL, 0);
+ ASSERT_TRUE(false);
+ }
+ }
+
+ /* First batch - this should work */
+ ret = sys_listns(&req, first_batch, 3, 0);
+ if (ret < 0) {
+ if (errno == ENOSYS) {
+ close(sv[0]);
+ for (i = 0; i < num_children; i++)
+ kill(pids[i], SIGKILL);
+ for (i = 0; i < num_children; i++)
+ waitpid(pids[i], NULL, 0);
+ SKIP(return, "listns() not supported");
+ }
+ ASSERT_GE(ret, 0);
+ }
+
+ TH_LOG("First batch returned %zd entries", ret);
+
+ if (ret == 3) {
+ __u64 second_batch[3];
+
+ /* Second batch - pagination triggers the bug */
+ req.ns_id = first_batch[2]; /* Continue from last ID */
+ ret = sys_listns(&req, second_batch, 3, 0);
+
+ TH_LOG("Second batch returned %zd entries", ret);
+ ASSERT_GE(ret, 0);
+ }
+
+ /* Signal all children to exit */
+ for (i = 0; i < num_children; i++) {
+ char c = 'X';
+
+ if (write(sv[0], &c, 1) != 1) {
+ close(sv[0]);
+ for (int j = i; j < num_children; j++)
+ kill(pids[j], SIGKILL);
+ for (int j = 0; j < num_children; j++)
+ waitpid(pids[j], NULL, 0);
+ ASSERT_TRUE(false);
+ }
+ }
+
+ close(sv[0]);
+
+ /* Cleanup */
+ for (i = 0; i < num_children; i++) {
+ int status;
+
+ waitpid(pids[i], &status, 0);
+ }
+}
+
TEST_HARNESS_MAIN
--
2.52.0
* Re: [PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period
2026-04-05 16:50 [PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period Yohei Kojima
` (3 preceding siblings ...)
2026-04-05 16:50 ` [PATCH 4/4] selftests/namespace: test spurious ENOENT bug in listns pagination Yohei Kojima
@ 2026-04-07 12:57 ` Yohei Kojima
2026-04-09 12:59 ` Christian Brauner
4 siblings, 1 reply; 7+ messages in thread
From: Yohei Kojima @ 2026-04-07 12:57 UTC (permalink / raw)
To: Christian Brauner, Shuah Khan; +Cc: linux-kernel, linux-kselftest
On Mon, Apr 06, 2026 at 01:50:36AM +0900, Yohei Kojima wrote:
> Yohei Kojima (4):
> selftests/namespace: fix selftest hang-up caused by zombie processes
> selftests/namespace: fix unintentional skip in ns_active_ref_test.c
> nstree: Fix spurious ENOENT in listns pagination during grace period
I'm sorry, the subjects of the cover letter and the third patch are
incorrect. This bug is unrelated to the RCU grace period; instead, it
is caused by the handling of inactive and destroyed namespaces. I'll
fix the subject in v2.
Thanks,
Yohei
> selftests/namespace: test spurious ENOENT bug in listns pagination
>
> kernel/nstree.c | 68 ++++--
> .../namespaces/listns_pagination_bug.c | 200 ++++++++++++++++++
> .../selftests/namespaces/ns_active_ref_test.c | 4 +
> .../testing/selftests/namespaces/nsid_test.c | 8 +
> 4 files changed, 258 insertions(+), 22 deletions(-)
>
> --
> 2.52.0
>
* Re: [PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period
2026-04-07 12:57 ` [PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period Yohei Kojima
@ 2026-04-09 12:59 ` Christian Brauner
0 siblings, 0 replies; 7+ messages in thread
From: Christian Brauner @ 2026-04-09 12:59 UTC (permalink / raw)
To: Yohei Kojima; +Cc: Shuah Khan, linux-kernel, linux-kselftest
On Tue, Apr 07, 2026 at 09:57:38PM +0900, Yohei Kojima wrote:
> On Mon, Apr 06, 2026 at 01:50:36AM +0900, Yohei Kojima wrote:
> > Yohei Kojima (4):
> > selftests/namespace: fix selftest hang-up caused by zombie processes
> > selftests/namespace: fix unintentional skip in ns_active_ref_test.c
> > nstree: Fix spurious ENOENT in listns pagination during grace period
>
> I'm sorry, the subjects of the cover letter and the third patch are
> incorrect. This bug is unrelated to the RCU grace period; instead, it
> is caused by the handling of inactive and destroyed namespaces. I'll
> fix the subject in v2.
Ok, sounds good. We can wait.