From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christian Brauner Date: Sat, 18 May 2019 10:04:46 +0000 Subject: Re: [PATCH v1 1/2] pid: add pidfd_open() Message-Id: <20190518100435.c5bqpcnra53dsr6p@brauner.io> List-Id: References: <20190516135944.7205-1-christian@brauner.io> <20190516224949.GA15401@localhost> In-Reply-To: <20190516224949.GA15401@localhost> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable To: Joel Fernandes Cc: jannh@google.com, oleg@redhat.com, viro@zeniv.linux.org.uk, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, arnd@arndb.de, akpm@linux-foundation.org, cyphar@cyphar.com, dhowells@redhat.com, ebiederm@xmission.com, elena.reshetova@intel.com, keescook@chromium.org, luto@amacapital.net, luto@kernel.org, tglx@linutronix.de, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.orgg, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org, linux-api@vger.kernel.org, linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org, dancol@google.com, serge@hallyn.com, su On Sat, May 18, 2019 at 05:48:03AM -0400, Joel Fernandes wrote: > Hi Christian, >=20 > For next revision, could you also CC surenb@google.com as well? He is also > working on the low memory killer. And also suggest CC to > kernel-team@android.com. And mentioned some comments below, thanks. Yip, totally. Just added them both to my Cc list. :) (I saw you added Suren manually. I added the Android kernel team now too.) >=20 > On Thu, May 16, 2019 at 03:59:42PM +0200, Christian Brauner wrote: > [snip] =20 > > diff --git a/kernel/pid.c b/kernel/pid.c > > index 20881598bdfa..4afca3d6dcb8 100644 > > --- a/kernel/pid.c > > +++ b/kernel/pid.c > > @@ -38,6 +38,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > =20 > > @@ -451,6 +452,55 @@ struct pid *find_ge_pid(int nr, struct pid_namespa= ce *ns) > > return idr_get_next(&ns->idr, &nr); > > } > > =20 > > +/** > > + * pidfd_open() - Open new pid file descriptor. > > + * > > + * @pid: pid for which to retrieve a pidfd > > + * @flags: flags to pass > > + * > > + * This creates a new pid file descriptor with the O_CLOEXEC flag set = for > > + * the process identified by @pid. Currently, the process identified by > > + * @pid must be a thread-group leader. This restriction currently exis= ts > > + * for all aspects of pidfds including pidfd creation (CLONE_PIDFD can= not > > + * be used with CLONE_THREAD) and pidfd polling (only supports thread = group > > + * leaders). > > + * > > + * Return: On success, a cloexec pidfd is returned. > > + * On error, a negative errno number will be returned. > > + */ > > +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags) > > +{ > > + int fd, ret; > > + struct pid *p; > > + struct task_struct *tsk; > > + > > + if (flags) > > + return -EINVAL; > > + > > + if (pid <=3D 0) > > + return -EINVAL; > > + > > + p =3D find_get_pid(pid); > > + if (!p) > > + return -ESRCH; > > + > > + ret =3D 0; > > + rcu_read_lock(); > > + /* > > + * If this returns non-NULL the pid was used as a thread-group > > + * leader. Note, we race with exec here: If it changes the > > + * thread-group leader we might return the old leader. > > + */ > > + tsk =3D pid_task(p, PIDTYPE_TGID); >=20 > Just trying to understand the comment here. The issue is that we might ei= ther > return the new leader, or the old leader depending on the overlap with > concurrent de_thread right? In either case, we don't care though. >=20 > I suggest to remove the "Note..." part of the comment since it doesn't se= em the > race is relevant here unless we are doing something else with tsk in the > function, but if you want to keep it that's also fine. Comment text should > probably should be 'return the new leader' though. Nah, I actually removed the comment already independently (cf. see [1]). [1]: https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/comm= it/?h=3Dpidfd_open&id=DCfc98c2d957bf3ac14b06414cb2cf4c673fc297 >=20 > > + if (!tsk) > > + ret =3D -ESRCH; >=20 > Perhaps -EINVAL? AFAICS, this can only happen if a CLONE_THREAD pid was > passed as argument to pidfd_open which is invalid. But let me know what y= ou > had in mind.. Hm, from the kernel's perspective ESRCH is correct but I guess EINVAL is nicer for userspace. So I don't have objections to using EINVAL. My first version did too I think. Thanks! Christian