From: Oleg Nesterov <oleg@redhat.com>
To: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Christian Brauner <brauner@kernel.org>,
Shuah Khan <shuah@kernel.org>, Kees Cook <kees@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Jan Kara <jack@suse.cz>, Aleksa Sarai <cyphar@cyphar.com>,
Andrei Vagin <avagin@google.com>, Kirill Tkhai <tkhai@ya.ru>,
Alexander Mikhalitsyn <alexander@mihalicyn.com>,
Adrian Reber <areber@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-kselftest@vger.kernel.org,
Mateusz Guzik <mjguzik@gmail.com>
Subject: Re: [PATCH v2 1/2] pid_namespace: allow opening pid_for_children before init was created
Date: Tue, 24 Feb 2026 17:03:41 +0100 [thread overview]
Message-ID: <aZ3L3cRL8AEmfQpP@redhat.com> (raw)
In-Reply-To: <3f095a91-f052-4f38-a8e4-2e6dbc9a0c6a@virtuozzo.com>
On 02/24, Pavel Tikhomirov wrote:
>
> On 2/24/26 13:09, Oleg Nesterov wrote:
> > On 02/23, Pavel Tikhomirov wrote:
> >>
> >> To avoid possible problems related to cpu/compiler optimizations around
> >> ->child_reaper, let's use WRITE_ONCE (additional to task_list lock)
> >> everywhere we write it and use READ_ONCE everywhere we read it without
> >> explicit lock.
> >
> > Yes, this is what I meant... but I can never recall if READ_ONCE() alone
> > is enough to make KCSAN happy...
>
> AFAICS this should be fine for memory safety of accesses to ->child_reaper.
> I would love if someone more experienced in this area would confirm.
__READ_ONCE() uses volatile cast, DEFINE_TSAN_VOLATILE_READ_WRITE()
will pass KCSAN_ACCESS_ATOMIC to check_access(), so it seems that
READ_ONCE() should be enough...
But I am not sure, I don't really understand this code.
> > I won't insist, but I think it would be better to do this in a separate
> > change for documenation purposes and for discussion.
>
> Ok, will do. It will be a bit ugly as I will first add READ_ONCE to the
> pidns_for_children_get() and then remove the hunk with it in the next patch.
Agreed, this is ugly. I almost regret I mentioned _ONCE() in the previous
discussions, I only tried to "nack" another read_lock(tasklist).
So lets avoid a separate change and WRITE_ONCE()'s in copy_process/find_child_reaper,
we can add them later if KCSAN complains, they are not needed for correctness.
But up to you, I am fine either way.
> > Perhaps something like the preparational patch below makes sense ? Not
> > sure this is actually better...
>
> This looks more universal at least, as instead of two checks we have one in one
> place. My only concern of putting the check where I put it was to avoid extra
> idr_alloc_cyclic() + idr_remove(), if we already know we don't need it. But it's
> only in last pid_namespace we can have ->child_reaper unset so we do alloc/remove
> for all other namespaces anyway in error case, should not be a big deal.
Yes...
> I will add the preparation patches: for below patch and related to _ONCE.
Again, up to you. But either way it would be nice to have a comment or at
least a note in the changelog to explain that this is also needed to avoid
the race between alloc_pid() + fail and another alloc_pid(). This is subtle.
Oleg.
next prev parent reply other threads:[~2026-02-24 16:03 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-23 20:01 [PATCH v2 0/2] pid_namespace: make init creation more flexible Pavel Tikhomirov
2026-02-23 20:01 ` [PATCH v2 1/2] pid_namespace: allow opening pid_for_children before init was created Pavel Tikhomirov
2026-02-24 7:02 ` Andrei Vagin
2026-02-24 10:37 ` Pavel Tikhomirov
2026-02-24 15:38 ` Andrei Vagin
2026-02-24 16:09 ` Oleg Nesterov
2026-02-24 12:09 ` Oleg Nesterov
2026-02-24 13:23 ` Pavel Tikhomirov
2026-02-24 16:03 ` Oleg Nesterov [this message]
2026-02-24 16:35 ` Pavel Tikhomirov
2026-02-23 20:01 ` [PATCH v2 2/2] selftests: Add tests for creating pidns init via setns Pavel Tikhomirov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aZ3L3cRL8AEmfQpP@redhat.com \
--to=oleg@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=alexander@mihalicyn.com \
--cc=areber@redhat.com \
--cc=avagin@google.com \
--cc=brauner@kernel.org \
--cc=cyphar@cyphar.com \
--cc=david@kernel.org \
--cc=jack@suse.cz \
--cc=juri.lelli@redhat.com \
--cc=kees@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@redhat.com \
--cc=mjguzik@gmail.com \
--cc=peterz@infradead.org \
--cc=ptikhomirov@virtuozzo.com \
--cc=shuah@kernel.org \
--cc=tkhai@ya.ru \
--cc=vincent.guittot@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.