From: Oleg Nesterov <oleg@redhat.com>
To: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Christian Brauner <brauner@kernel.org>,
Shuah Khan <shuah@kernel.org>, Kees Cook <kees@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Jan Kara <jack@suse.cz>, Aleksa Sarai <cyphar@cyphar.com>,
Andrei Vagin <avagin@google.com>, Kirill Tkhai <tkhai@ya.ru>,
Alexander Mikhalitsyn <alexander@mihalicyn.com>,
Adrian Reber <areber@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v3 2/4] pid: check init is created first after idr alloc
Date: Wed, 25 Feb 2026 09:47:19 +0100
Message-ID: <aZ63F7vHRKw9qZw9@redhat.com>
In-Reply-To: <20260224164852.306583-3-ptikhomirov@virtuozzo.com>

On 02/24, Pavel Tikhomirov wrote:
>
> This moves the condition (tid != 1 && !tmp->child_reaper) to after idr
> alloc, so it not only covers that first process in pid namespace has pid
> 1 in case of clone3(set_tid) requesting wrong pid, but also if idr
> itself gives wrong pid for some reason.
>
> This could've been the case before this patch, when creating first
> process the alloc_pid()->pidfs_add_pid() code path fails, so that the
> idr->idr_next is non zero anymore and next process calling to
> alloc_pid(), will get 2 as a pid from idr_alloc_cyclic(). Effectively
> leading to init-less pid namespace, which is a bug.

Yes.

alloc_pid() does:

	/* On failure to allocate the first pid, reset the state */
	if (ns->pid_allocated == PIDNS_ADDING)
		idr_set_cursor(&ns->idr, 0);

but this logic is broken.

Suppose that a task P does sys_unshare(CLONE_NEWPID). Then it does
fork(), and fork() fails for any reason after alloc_pid() succeeds.
If P does another fork() to retry, we have a bug.
So with this patch we can either remove the code above, or (better)
improve this logic.
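
To make the scenario above concrete, here is a small user-space toy model, not kernel code. It assumes, as the scenario implies, that freeing an id does not rewind the cyclic cursor. All names (toy_pidns, toy_alloc_cyclic, toy_free, toy_alloc_pid_unchecked, toy_alloc_pid_checked) are hypothetical, and the model skips the wrap-around a real cyclic allocator performs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_MAX_ID 8

/* Toy model: only the state relevant to the scenario. */
struct toy_pidns {
	int cursor;			/* stands in for the idr cursor */
	bool in_use[TOY_MAX_ID];	/* which ids are currently allocated */
	bool has_reaper;		/* stands in for ns->child_reaper != NULL */
};

/* First free id at or after the cursor (minimum 1); then advance the cursor. */
static int toy_alloc_cyclic(struct toy_pidns *ns)
{
	int start = ns->cursor < 1 ? 1 : ns->cursor;

	for (int id = start; id < TOY_MAX_ID; id++) {
		if (!ns->in_use[id]) {
			ns->in_use[id] = true;
			ns->cursor = id + 1;
			return id;
		}
	}
	return -ENOSPC;	/* no wrap-around in this toy */
}

/* Releasing an id leaves the cursor where it was (the assumption above). */
static void toy_free(struct toy_pidns *ns, int id)
{
	assert(id > 0 && id < TOY_MAX_ID);
	ns->in_use[id] = false;
}

/* No check on the number that the allocator hands back. */
static int toy_alloc_pid_unchecked(struct toy_pidns *ns)
{
	return toy_alloc_cyclic(ns);
}

/* Check after allocation: without a reaper, only nr == 1 is acceptable. */
static int toy_alloc_pid_checked(struct toy_pidns *ns)
{
	int nr = toy_alloc_cyclic(ns);

	if (nr > 0 && !ns->has_reaper && nr != 1) {
		toy_free(ns, nr);
		return -EINVAL;
	}
	return nr;
}
```

In this model, allocating, freeing (the failed fork()), and allocating again through toy_alloc_pid_unchecked() yields 1 and then 2, while toy_alloc_pid_checked() returns -EINVAL on the retry because the cursor was never reset.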
> Note: This is also a preparation for the next patch in the series, which
> will introduce an ability of creating init from the task different to
> the task which had created the pid namespace. Needed to make sure that
> init is always first, even in this new case.
>
> Suggested-by: Oleg Nesterov <oleg@redhat.com>
> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> @@ -296,9 +290,18 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *arg_set_tid,
>
>  		pid->numbers[i].nr = nr;
>  		pid->numbers[i].ns = tmp;
> -		tmp = tmp->parent;
>  		i--;
>  		retried_preload = false;
> +
> +		/*
> +		 * PID 1 (init) must be created first.
> +		 */
> +		if (!READ_ONCE(tmp->child_reaper) && nr != 1) {
> +			retval = -EINVAL;
> +			goto out_free;
> +		}
> +
> +		tmp = tmp->parent;
>  	}

Cosmetic, but why did you move "tmp = tmp->parent;" down? This is fine
but not strictly necessary. OTOH, if you do this, perhaps it makes sense
to move "retried_preload = false;" as well?
Oleg.