From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Wed, 19 Aug 2020 13:32:59 +0000
Subject: Re: [PATCH 00/11] Introduce kernel_clone(), kill _do_fork()
Message-Id: <87a6yq222c.fsf@x220.int.ebiederm.org>
List-Id:
References: <20200818173411.404104-1-christian.brauner@ubuntu.com>
 <20200818174447.GV17456@casper.infradead.org>
 <20200819074340.GW2674@hirez.programming.kicks-ass.net>
 <20200819084556.im5zfpm2iquzvzws@wittgenstein>
 <20200819111851.GY17456@casper.infradead.org>
In-Reply-To: <20200819111851.GY17456@casper.infradead.org> (Matthew Wilcox's
 message of "Wed, 19 Aug 2020 12:18:51 +0100")
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Matthew Wilcox
Cc: Christian Brauner , peterz@infradead.org, Christoph Hewllig ,
 linux-kernel@vger.kernel.org, Linus Torvalds ,
 linux-arch@vger.kernel.org, Jonathan Corbet , Yoshinori Sato ,
 Tony Luck , Fenghua Yu , Geert Uytterhoeven , Ley Foon Tan ,
 "David S. Miller" , Thomas Gleixner , Ingo Molnar , Borislav Petkov ,
 x86@kernel.org, Arnd Bergmann , Steven Rostedt , Stafford Horne ,
 Kars de Jong , Kees Cook , Greentime Hu , Mauro Carvalho Chehab ,
 Alexandre Chartre , Masami Hiramatsu , Tom Zanussi , Xiao Yang ,
 linux-doc@vger.kernel.org, uclinux-h8-devel@lists.sourceforge.jp,
 linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
 sparclinux@vger.kernel.org, kgdb-bugreport@lists.sourceforge.net,
 linux-kselftest@vger.kernel.org

Matthew Wilcox writes:

> On Wed, Aug 19, 2020 at 10:45:56AM +0200, Christian Brauner wrote:
>> On Wed, Aug 19, 2020 at 09:43:40AM +0200, peterz@infradead.org wrote:
>> > On Tue, Aug 18, 2020 at 06:44:47PM +0100, Matthew Wilcox wrote:
>> > > On Tue, Aug 18, 2020 at 07:34:00PM +0200, Christian Brauner wrote:
>> > > > The only remaining function callable outside of kernel/fork.c is
>> > > > _do_fork(). It doesn't really follow the naming of kernel-internal
>> > > > syscall helpers as Christoph rightly pointed out. Switch all callers and
>> > > > references to kernel_clone() and remove _do_fork() once and for all.
>> > >
>> > > My only concern is around return type. long, int, pid_t ... can we
>> > > choose one and stick to it? pid_t is probably the right return type
>> > > within the kernel, despite the return type of clone3(). It'll save us
>> > > some work if we ever go through the hassle of growing pid_t beyond 31-bit.
>> >
>> > We have at least the futex ABI restricting PID space to 30 bits.
>>
>> Ok, looking into kernel/futex.c I see
>>
>> pid_t pid = uval & FUTEX_TID_MASK;
>>
>> which is probably what this refers to and /proc/sys/kernel/threads-max
>> is restricted to FUTEX_TID_MASK.
>>
>> Afaict, that doesn't block switching kernel_clone() to return pid_t. It
>> can't create anything > FUTEX_TID_MASK anyway without yelling EAGAIN at
>> userspace. But it means that _if_ we were to change the size of pid_t
>> we'd likely need a new futex API.
>
> Yes, there would be a lot of work to do to increase the size of pid_t.
> I'd just like to not do anything to make that harder _now_. Stick to
> using pid_t within the kernel.

Just so people are aware. If you look in include/linux/threads.h you
can see that the maximum value of PID_MAX_LIMIT limits pids to 22 bits.

Further, the design decisions around pids keep us using them densely. So
I expect it will be a while before we even come close to using 30 bits
of pid space.

At the same time I do agree that it makes sense to use a consistent type
in the kernel to make it easier to read and update the code.

Eric