Linux userland API discussions
* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
From: Willy Tarreau @ 2019-09-21  3:05 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Linus Torvalds, Ahmed S. Darwish, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Matthew Garrett, lkml, Ext4 Developers List,
	Linux API, linux-man
In-Reply-To: <CALCETrWCjGHKnKikj+YVw22Ufpmnh1TCdGPjG2RL-qzsF=wisA@mail.gmail.com>

On Fri, Sep 20, 2019 at 04:30:20PM -0700, Andy Lutomirski wrote:
> So I think that just improving the
> getrandom()-is-blocking-on-x86-and-arm behavior, adding GRND_INSECURE
> and GRND_SECURE_BLOCKING, and adding the warning if 0 is passed is
> good enough.

I think so as well. Anyway, keep in mind that *with a sane API*,
userland can improve very quickly (faster than kernel deployments in
field). But userland developers need reliable and testable support for
features. If it's enough to do #ifndef GRND_xxx/#define GRND_xxx and
call getrandom() with these flags to detect support, it's basically 5
reliable lines of code to add to userland to make a warning disappear
and/or to allow a system that previously failed to boot to now boot. So
this gives strong incentive to userland to adopt the new API, provided
there's a way for the developer to understand what's happening (which
the warning does).
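
A rough sketch of what those few lines could look like in userland. It is
not taken from any posted patch. It assumes the new flag ends up being
called GRND_INSECURE with the value 0x0004 (the value is an assumption, it
is not stated in this thread), and it relies on getrandom() rejecting
unknown flag bits with EINVAL on older kernels. The helper name
random_bytes_nonblocking() is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/* Assumption: flag name and value as discussed for the new API; building
   against old headers still works because we define it ourselves. */
#ifndef GRND_INSECURE
#define GRND_INSECURE 0x0004
#endif

/* Fill buf with up to len best-effort random bytes without blocking.
   Returns the number of bytes written, or -1 with errno set. */
static ssize_t
random_bytes_nonblocking(void *buf, size_t len)
{
    ssize_t n = getrandom(buf, len, GRND_INSECURE);

    if (n == -1 && errno == EINVAL) {
        /* Kernel does not know the new bit: fall back to the existing
           nonblocking flag, which fails with EAGAIN instead of waiting. */
        n = getrandom(buf, len, GRND_NONBLOCK);
    }
    return n;
}
```

The runtime probe is what makes the feature testable: the same binary works
on old and new kernels, and the caller decides what to do on EAGAIN.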

If we do it right, all we'll hear are userland developers complaining
that those stupid kernel developers have changed their API again and
really don't know what they want. That will be a good sign that the
warning flows back to them and that adoption is taking.

And if the change is small enough, maybe it could make sense to backport
it to stable versions to fix boot issues. With a testable feature it
does make sense.

Willy


* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
From: Florian Weimer @ 2019-09-21  6:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andy Lutomirski, Ahmed S. Darwish, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Willy Tarreau, Matthew Garrett, lkml,
	Ext4 Developers List, Linux API, linux-man
In-Reply-To: <CAHk-=wjpTWgpo6d24pTv+ubfea_uEomX-sHjjOkdACfV-8Nmkg@mail.gmail.com>

* Linus Torvalds:

> Violently agreed. And that's kind of what the GRND_EXPLICIT is really
> aiming for.
>
> However, it's worth noting that nobody should ever use GRND_EXPLICIT
> directly. That's just the name for the bit. The actual users would use
> GRND_INSECURE or GRND_SECURE.

Should we switch glibc's getentropy to GRND_EXPLICIT?  Or something
else?

I don't think we want to print a kernel warning for this function.

Thanks,
Florian


* For review: pidfd_open(2) manual page
From: Michael Kerrisk (man-pages) @ 2019-09-23  9:11 UTC (permalink / raw)
  To: Christian Brauner, Jann Horn, Daniel Colascione,
	Eric W. Biederman, Joel Fernandes
  Cc: mtk.manpages, Linux API, lkml, linux-man, Oleg Nesterov

Hello Christian and all,

Below, I have the rendered version of the current draft of
the pidfd_open(2) manual page that I have written.
The page source can be found in a Git branch at:
https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/log/?h=draft_pidfd

I would be pleased to receive corrections and notes on any
details that should be added. (For example, are there error
cases that I have missed?)

Would you be able to review please?

Thanks,

Michael


NAME
       pidfd_open - obtain a file descriptor that refers to a process

SYNOPSIS
       int pidfd_open(pid_t pid, unsigned int flags);

DESCRIPTION
       The  pidfd_open()  system creates a file descriptor that refers to
       the process whose PID is specified in pid.  The file descriptor is
       returned  as the function result; the close-on-exec flag is set on
       the file descriptor.

       The flags argument is reserved for  future  use;  currently,  this
       argument must be specified as 0.

RETURN VALUE
       On  success,  pidfd_open()  returns a nonnegative file descriptor.
       On success, -1 is returned and errno is set to indicate the  cause
       of the error.

ERRORS
       EINVAL flags is not 0.

       EINVAL pid is not valid.

       ESRCH  The process specified by pid does not exist.

VERSIONS
       pidfd_open() first appeared in Linux 5.3.

CONFORMING TO
       pidfd_open() is Linux specific.

NOTES
       Currently, there is no glibc wrapper for this system call; call it
       using syscall(2).

       The pidfd_send_signal(2) system call can be used to send a  signal
       to the process referred to by a PID file descriptor.

       A  PID  file descriptor can be monitored using poll(2), select(2),
       and epoll(7).  When the process that it refers to terminates,  the
       file descriptor indicates as readable.  Note, however, that in the
       current implementation, nothing can be read from the file descrip‐
       tor.

       The  pidfd_open()  system call is the preferred way of obtaining a
       PID file descriptor.  The alternative is to obtain a file descrip‐
       tor by opening a /proc/[pid] directory.  However, the latter tech‐
       nique is possible only if the proc(5) file system is mounted; fur‐
       thermore,  the  file  descriptor  obtained in this way is not pol‐
       lable.

       See also the discussion of the CLONE_PIDFD flag in clone(2).

EXAMPLE
       The program below opens a PID  file  descriptor  for  the  process
       whose PID is specified as its command-line argument.  It then mon‐
       itors the file descriptor for readability (POLLIN) using  poll(2).
       When  the  process  with  the  specified  PID  terminates, poll(2)
       returns, and indicates that the file descriptor is readable.

   Program source

       #define _GNU_SOURCE
       #include <sys/syscall.h>
       #include <unistd.h>
       #include <poll.h>
       #include <stdlib.h>
       #include <stdio.h>

       #ifndef __NR_pidfd_open
       #define __NR_pidfd_open 434
       #endif

       static
       int pidfd_open(pid_t pid, unsigned int flags)
       {
           return syscall(__NR_pidfd_open, pid, flags);
       }

       int
       main(int argc, char *argv[])
       {
           struct pollfd pollfd;
           int pidfd, ready;

           if (argc != 2) {
               fprintf(stderr, "Usage: %s <pid>\n", argv[0]);
               exit(EXIT_FAILURE);
           }

           pidfd = pidfd_open(atoi(argv[1]), 0);
           if (pidfd == -1) {
               perror("pidfd_open");
               exit(EXIT_FAILURE);
           }

           pollfd.fd = pidfd;
           pollfd.events = POLLIN;

           ready = poll(&pollfd, 1, -1);
           if (ready == -1) {
               perror("poll");
               exit(EXIT_FAILURE);
           }

           printf("Events (0x%x): POLLIN is %sset\n", pollfd.revents,
                   (pollfd.revents & POLLIN) ? "" : "not ");

           exit(EXIT_SUCCESS);
       }

SEE ALSO
       clone(2),  kill(2),  pidfd_send_signal(2),   poll(2),   select(2),
       epoll(7)


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* For review: pidfd_send_signal(2) manual page
From: Michael Kerrisk (man-pages) @ 2019-09-23  9:12 UTC (permalink / raw)
  To: Oleg Nesterov, Christian Brauner, Jann Horn, Eric W. Biederman,
	Daniel Colascione, Joel Fernandes
  Cc: mtk.manpages, linux-man, Linux API, lkml

Hello Christian and all,

Below, I have the rendered version of the current draft of
the pidfd_send_signal(2) manual page that I have written.
The page source can be found in a Git branch at:
https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/log/?h=draft_pidfd

I would be pleased to receive corrections and notes on any
details that should be added. (For example, are there error
cases that I have missed?)

Would you be able to review please?

Thanks,

Michael


NAME
       pidfd_send_signal - send a signal to a process specified by a file
       descriptor

SYNOPSIS
       int pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
                             unsigned int flags);

DESCRIPTION
       The pidfd_send_signal() system call sends the signal  sig  to  the
       target  process  referred  to by pidfd, a PID file descriptor that
       refers to a process.

       If the info argument points to a  siginfo_t  buffer,  that  buffer
       should be populated as described in rt_sigqueueinfo(2).

       If  the  info  argument  is  a NULL pointer, this is equivalent to
       specifying a pointer to a siginfo_t buffer whose fields match  the
       values  that  are  implicitly supplied when a signal is sent using
       kill(2):

       *  si_signo is set to the signal number;
       *  si_errno is set to 0;
       *  si_code is set to SI_USER;
       *  si_pid is set to the caller's PID; and
       *  si_uid is set to the caller's real user ID.
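
A minimal sketch of how the NULL-info case could be observed, assuming a
single-threaded process on a kernel that provides both pidfd_open() and
pidfd_send_signal(). The process signals itself and an SA_SIGINFO handler
records what it received. The function and variable names are hypothetical;
the syscall numbers are the fallbacks used in the example programs in this
thread.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef __NR_pidfd_open
#define __NR_pidfd_open 434
#endif
#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424
#endif

static volatile sig_atomic_t seen_code = -1;
static volatile sig_atomic_t seen_pid_matches = 0;

/* Record the si_code and whether si_pid is our own PID. */
static void
on_usr1(int sig, siginfo_t *si, void *ucontext)
{
    (void) sig;
    (void) ucontext;
    seen_code = si->si_code;
    seen_pid_matches = (si->si_pid == getpid());
}

/* Send SIGUSR1 to ourselves through a pidfd with info == NULL.
   Returns 1 if the handler saw SI_USER and our own PID, 0 if it did not,
   or -1 (with errno set) if a system call failed. */
static int
null_info_looks_like_kill(void)
{
    struct sigaction sa;
    int pidfd, ret, saved;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_usr1;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    pidfd = syscall(__NR_pidfd_open, getpid(), 0);
    if (pidfd == -1)
        return -1;

    ret = syscall(__NR_pidfd_send_signal, pidfd, SIGUSR1, NULL, 0);
    saved = errno;
    close(pidfd);
    errno = saved;
    if (ret == -1)
        return -1;

    return seen_code == SI_USER && seen_pid_matches;
}
```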

       The calling process must either be in the same  PID  namespace  as
       the  process  referred  to  by pidfd, or be in an ancestor of that
       namespace.

       The flags argument is reserved for  future  use;  currently,  this
       argument must be specified as 0.

RETURN VALUE
       On  success,  pidfd_send_signal()  returns  0.   On success, -1 is
       returned and errno is set to indicate the cause of the error.

ERRORS
       EBADF  pidfd is not a valid PID file descriptor.

       EINVAL sig is not a valid signal.

       EINVAL The calling process is not in a PID namespace from which it
              can send a signal to the target process.

       EINVAL flags is not 0.

       EPERM  The  calling  process  does not have permission to send the
              signal to the target process.

       EPERM  pidfd  doesn't  refer   to   the   calling   process,   and
              info.si_code is invalid (see rt_sigqueueinfo(2)).

       ESRCH  The target process does not exist.

VERSIONS
       pidfd_send_signal() first appeared in Linux 5.1.

CONFORMING TO
       pidfd_send_signal() is Linux specific.

NOTES
       Currently, there is no glibc wrapper for this system call; call it
       using syscall(2).

   PID file descriptors
       The pidfd argument is a PID file  descriptor,  a  file  descriptor
       that refers to a process.  Such a file descriptor can be  obtained
       in any of the following ways:

       *  by opening a /proc/[pid] directory;

       *  using pidfd_open(2); or

       *  via the PID file descriptor that  is  returned  by  a  call  to
          clone(2) or clone3(2) that specifies the CLONE_PIDFD flag.

       The  pidfd_send_signal()  system call allows the avoidance of race
       conditions that occur when using traditional interfaces  (such  as
       kill(2)) to signal a process.  The problem is that the traditional
       interfaces specify the target process via a process ID (PID), with
       the  result  that the sender may accidentally send a signal to the
       wrong process if the originally intended target process has termi‐
       nated  and its PID has been recycled for another process.  By con‐
       trast, a PID file descriptor is a stable reference to  a  specific
       process;  if  that  process  terminates,  then the file descriptor
       ceases to be  valid  and  the  caller  of  pidfd_send_signal()  is
       informed of this fact via an ESRCH error.

EXAMPLE
       #define _GNU_SOURCE
       #include <limits.h>
       #include <signal.h>
       #include <fcntl.h>
       #include <stdio.h>
       #include <string.h>
       #include <stdlib.h>
       #include <unistd.h>
       #include <sys/syscall.h>

       #ifndef __NR_pidfd_send_signal
       #define __NR_pidfd_send_signal 424
       #endif

       static
       int pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
               unsigned int flags)
       {
           return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
       }

       int
       main(int argc, char *argv[])
       {
           siginfo_t info;
           char path[PATH_MAX];
           int pidfd, sig;

           if (argc != 3) {
               fprintf(stderr, "Usage: %s <pid> <signal>\n", argv[0]);
               exit(EXIT_FAILURE);
           }

           sig = atoi(argv[2]);

           /* Obtain a PID file descriptor by opening the /proc/PID directory
              of the target process */

           snprintf(path, sizeof(path), "/proc/%s", argv[1]);

           pidfd = open(path, O_RDONLY);
           if (pidfd == -1) {
               perror("open");
               exit(EXIT_FAILURE);
           }

           /* Populate a 'siginfo_t' structure for use with
              pidfd_send_signal() */

           memset(&info, 0, sizeof(info));
           info.si_code = SI_QUEUE;
           info.si_signo = sig;
           info.si_errno = 0;
           info.si_uid = getuid();
           info.si_pid = getpid();
           info.si_value.sival_int = 1234;

           /* Send the signal */

           if (pidfd_send_signal(pidfd, sig, &info, 0) == -1) {
               perror("pidfd_send_signal");
               exit(EXIT_FAILURE);
           }

           exit(EXIT_SUCCESS);
       }

SEE ALSO
       clone(2),   kill(2),   pidfd_open(2),  rt_sigqueueinfo(2),  sigac‐
       tion(2), pid_namespaces(7), signal(7)


* Re: For review: pidfd_open(2) manual page
From: Florian Weimer @ 2019-09-23 10:53 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Christian Brauner, Jann Horn, Daniel Colascione,
	Eric W. Biederman, Joel Fernandes, Linux API, lkml, linux-man,
	Oleg Nesterov
In-Reply-To: <90399dee-53d8-a82c-3871-9ec8f94601ce@gmail.com>

* Michael Kerrisk:

> SYNOPSIS
>        int pidfd_open(pid_t pid, unsigned int flags);

Should this mention <sys/types.h> for pid_t?

> ERRORS
>        EINVAL flags is not 0.
>
>        EINVAL pid is not valid.
>
>        ESRCH  The process specified by pid does not exist.

Presumably, EMFILE and ENFILE are also possible errors, and so is
ENOMEM.

>        A  PID  file descriptor can be monitored using poll(2), select(2),
>        and epoll(7).  When the process that it refers to terminates,  the
>        file descriptor indicates as readable.  Note, however, that in the
>        current implementation, nothing can be read from the file descrip‐
>        tor.

“is indicated as readable” or “becomes readable”?  Will reading block?

>        The  pidfd_open()  system call is the preferred way of obtaining a
>        PID file descriptor.  The alternative is to obtain a file descrip‐
>        tor by opening a /proc/[pid] directory.  However, the latter tech‐
>        nique is possible only if the proc(5) file system is mounted; fur‐
>        thermore,  the  file  descriptor  obtained in this way is not pol‐
>        lable.

One question is whether the glibc wrapper should fall back to the
/proc subdirectory if it is not available.  Probably not.

>        static
>        int pidfd_open(pid_t pid, unsigned int flags)
>        {
>            return syscall(__NR_pidfd_open, pid, flags);
>        }

Please call this function something else (not pidfd_open), so that the
example continues to work if glibc provides the system call wrapper.


* Re: For review: pidfd_open(2) manual page
From: Daniel Colascione @ 2019-09-23 11:26 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Michael Kerrisk (man-pages), Christian Brauner, Jann Horn,
	Eric W. Biederman, Joel Fernandes, Linux API, lkml, linux-man,
	Oleg Nesterov
In-Reply-To: <87tv939td6.fsf@mid.deneb.enyo.de>

On Mon, Sep 23, 2019 at 3:53 AM Florian Weimer <fw@deneb.enyo.de> wrote:
>
> * Michael Kerrisk:
>
> > SYNOPSIS
> >        int pidfd_open(pid_t pid, unsigned int flags);
>
> Should this mention <sys/types.h> for pid_t?
>
> > ERRORS
> >        EINVAL flags is not 0.
> >
> >        EINVAL pid is not valid.
> >
> >        ESRCH  The process specified by pid does not exist.
>
> Presumably, EMFILE and ENFILE are also possible errors, and so is
> ENOMEM.
>
> >        A  PID  file descriptor can be monitored using poll(2), select(2),
> >        and epoll(7).  When the process that it refers to terminates,  the
> >        file descriptor indicates as readable.

The phrase "becomes readable" is simpler than "indicates as readable"
and conveys the same meaning. I agree with Florian's comment on this
point below.

> > Note, however, that in the
> >        current implementation, nothing can be read from the file descrip‐
> >        tor.
>
> “is indicated as readable” or “becomes readable”?  Will reading block?
>
> >        The  pidfd_open()  system call is the preferred way of obtaining a
> >        PID file descriptor.  The alternative is to obtain a file descrip‐
> >        tor by opening a /proc/[pid] directory.  However, the latter tech‐
> >        nique is possible only if the proc(5) file system is mounted; fur‐
> >        thermore,  the  file  descriptor  obtained in this way is not pol‐
> >        lable.

Referring to procfs directory FDs as pidfds will probably confuse
people. I'd just omit this paragraph.

> One question is whether the glibc wrapper should fall back to the
> /proc subdirectory if it is not available.  Probably not.

I'd prefer that glibc not provide this kind of fallback.
posix_fallocate-style emulation is, IMHO, too surprising.


* Re: For review: pidfd_send_signal(2) manual page
From: Florian Weimer @ 2019-09-23 11:26 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Oleg Nesterov, Christian Brauner, Jann Horn, Eric W. Biederman,
	Daniel Colascione, Joel Fernandes, linux-man, Linux API, lkml
In-Reply-To: <f21dbd73-5ef4-fb5b-003f-ff4fec34a1de@gmail.com>

* Michael Kerrisk:

> SYNOPSIS
>        int pidfd_send_signal(int pidfd, int sig, siginfo_t info,
>                              unsigned int flags);

This probably should reference a header for siginfo_t.

>        ESRCH  The target process does not exist.

If the descriptor is valid, does this mean the process has been waited
for?  Maybe this can be made more explicit.

>        The  pidfd_send_signal()  system call allows the avoidance of race
>        conditions that occur when using traditional interfaces  (such  as
>        kill(2)) to signal a process.  The problem is that the traditional
>        interfaces specify the target process via a process ID (PID), with
>        the  result  that the sender may accidentally send a signal to the
>        wrong process if the originally intended target process has termi‐
>        nated  and its PID has been recycled for another process.  By con‐
>        trast, a PID file descriptor is a stable reference to  a  specific
>        process;  if  that  process  terminates,  then the file descriptor
>        ceases to be  valid  and  the  caller  of  pidfd_send_signal()  is
>        informed of this fact via an ESRCH error.

It would be nice to explain somewhere how you can avoid the race using
a PID descriptor.  Is there anything else besides CLONE_PIDFD?

>        static
>        int pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
>                unsigned int flags)
>        {
>            return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
>        }

Please use a different function name.  Thanks.


* Re: For review: pidfd_send_signal(2) manual page
From: Daniel Colascione @ 2019-09-23 11:31 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Oleg Nesterov, Christian Brauner, Jann Horn, Eric W. Biederman,
	Joel Fernandes, linux-man, Linux API, lkml
In-Reply-To: <f21dbd73-5ef4-fb5b-003f-ff4fec34a1de@gmail.com>

On Mon, Sep 23, 2019 at 2:12 AM Michael Kerrisk (man-pages)
<mtk.manpages@gmail.com> wrote:
>        The  pidfd_send_signal()  system call allows the avoidance of race
>        conditions that occur when using traditional interfaces  (such  as
>        kill(2)) to signal a process.  The problem is that the traditional
>        interfaces specify the target process via a process ID (PID), with
>        the  result  that the sender may accidentally send a signal to the
>        wrong process if the originally intended target process has termi‐
>        nated  and its PID has been recycled for another process.  By con‐
>        trast, a PID file descriptor is a stable reference to  a  specific
>        process;  if  that  process  terminates,  then the file descriptor
>        ceases to be  valid

The file *descriptor* remains valid even after the process to which it
refers exits. You can close(2) the file descriptor without getting
EBADF. I'd say, instead, that "a PID file descriptor is a stable
reference to a specific process; process-related operations on a PID
file descriptor fail after that process exits".
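
A small sketch of that distinction, assuming a kernel with pidfd_open()
and pidfd_send_signal(). The function name is made up for the example and
the syscall numbers are the fallbacks from Michael's drafts. After the
child has been reaped, signalling through the pidfd is expected to fail
with ESRCH, while fcntl(F_GETFD) and close() still treat the descriptor as
an ordinary open file descriptor.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef __NR_pidfd_open
#define __NR_pidfd_open 434
#endif
#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424
#endif

/* Returns 1 if, after the child is reaped, signalling via the pidfd fails
   with ESRCH while the descriptor itself is still open and closable;
   0 if something else was observed; -1 (errno set) on setup failure. */
static int
pidfd_usable_after_reap(void)
{
    pid_t child;
    int pidfd, saved, sig_ret, sig_errno, fd_flags, close_ret;

    child = fork();
    if (child == -1)
        return -1;
    if (child == 0)
        _exit(0);

    pidfd = syscall(__NR_pidfd_open, child, 0);
    if (pidfd == -1) {
        saved = errno;
        waitpid(child, NULL, 0);
        errno = saved;
        return -1;
    }

    waitpid(child, NULL, 0);            /* the process is now gone */

    /* Process-related operation: expected to fail with ESRCH. */
    sig_ret = syscall(__NR_pidfd_send_signal, pidfd, 0, NULL, 0);
    sig_errno = errno;

    /* Descriptor-related operations: expected to succeed, no EBADF. */
    fd_flags = fcntl(pidfd, F_GETFD);
    close_ret = close(pidfd);

    return sig_ret == -1 && sig_errno == ESRCH &&
           fd_flags != -1 && close_ret == 0;
}
```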


* Re: For review: pidfd_send_signal(2) manual page
From: Christian Brauner @ 2019-09-23 14:23 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Michael Kerrisk (man-pages), Oleg Nesterov, Christian Brauner,
	Jann Horn, Eric W. Biederman, Daniel Colascione, Joel Fernandes,
	linux-man, Linux API, lkml
In-Reply-To: <87pnjr9rth.fsf@mid.deneb.enyo.de>

On Mon, Sep 23, 2019 at 01:26:34PM +0200, Florian Weimer wrote:
> * Michael Kerrisk:
> 
> > SYNOPSIS
> >        int pidfd_send_signal(int pidfd, int sig, siginfo_t info,
> >                              unsigned int flags);
> 
> This probably should reference a header for siginfo_t.

Agreed.

> 
> >        ESRCH  The target process does not exist.
> 
> If the descriptor is valid, does this mean the process has been waited
> for?  Maybe this can be made more explicit.

If by valid you mean "refers to a process/thread-group leader" aka is a
pidfd then yes: Getting ESRCH means that the process has exited and has
already been waited upon.
If it has only exited but not yet been waited upon, aka is a zombie, then sending
a signal will just work because that's currently how sending signals to
zombies works, i.e. if you only send a signal and don't do any
additional checks you won't notice a difference between a process being
alive and a process being a zombie. The userspace visible behavior in
terms of signaling them is identical.

> 
> >        The  pidfd_send_signal()  system call allows the avoidance of race
> >        conditions that occur when using traditional interfaces  (such  as
> >        kill(2)) to signal a process.  The problem is that the traditional
> >        interfaces specify the target process via a process ID (PID), with
> >        the  result  that the sender may accidentally send a signal to the
> >        wrong process if the originally intended target process has termi‐
> >        nated  and its PID has been recycled for another process.  By con‐
> >        trast, a PID file descriptor is a stable reference to  a  specific
> >        process;  if  that  process  terminates,  then the file descriptor
> >        ceases to be  valid  and  the  caller  of  pidfd_send_signal()  is
> >        informed of this fact via an ESRCH error.
> 
> It would be nice to explain somewhere how you can avoid the race using
> a PID descriptor.  Is there anything else besides CLONE_PIDFD?

If you're the parent of the process you can do this without CLONE_PIDFD:
pid = fork();
pidfd = pidfd_open(pid, 0);
ret = pidfd_send_signal(pidfd, 0, NULL, 0);
if (ret < 0 && errno == ESRCH)
	/* pidfd refers to another, recycled process */
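
To make the zombie versus reaped behaviour above concrete, here is a
sketch under the same assumptions (kernel with both system calls, fallback
syscall numbers as in Michael's drafts, function name invented for the
example). It uses waitid() with WNOWAIT to wait for the child to exit
without reaping it, probes with signal 0, then reaps and probes again.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef __NR_pidfd_open
#define __NR_pidfd_open 434
#endif
#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424
#endif

/* On return 0: *zombie_ret holds the result of signalling the unreaped
   (zombie) child, and *reaped_errno holds errno from signalling it after
   it was reaped (0 if that call unexpectedly succeeded).
   Returns -1 with errno set if the setup failed. */
static int
probe_zombie_then_reaped(int *zombie_ret, int *reaped_errno)
{
    siginfo_t si;
    pid_t child;
    int pidfd, ret, saved;

    child = fork();
    if (child == -1)
        return -1;
    if (child == 0)
        _exit(0);

    pidfd = syscall(__NR_pidfd_open, child, 0);
    if (pidfd == -1) {
        saved = errno;
        waitpid(child, NULL, 0);
        errno = saved;
        return -1;
    }

    /* Wait until the child has exited, but leave it as a zombie. */
    if (waitid(P_PID, (id_t) child, &si, WEXITED | WNOWAIT) == -1) {
        saved = errno;
        close(pidfd);
        errno = saved;
        return -1;
    }
    *zombie_ret = syscall(__NR_pidfd_send_signal, pidfd, 0, NULL, 0);

    /* Now reap the child and probe again. */
    waitpid(child, NULL, 0);
    ret = syscall(__NR_pidfd_send_signal, pidfd, 0, NULL, 0);
    *reaped_errno = (ret == -1) ? errno : 0;

    close(pidfd);
    return 0;
}
```

With the behaviour described above, the first probe should return 0 and
the second should fail with ESRCH.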

> 
> >        static
> >        int pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
> >                unsigned int flags)
> >        {
> >            return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
> >        }
> 
> Please use a different function name.  Thanks.


* Re: For review: pidfd_send_signal(2) manual page
From: Christian Brauner @ 2019-09-23 14:29 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Oleg Nesterov, Christian Brauner, Jann Horn, Eric W. Biederman,
	Daniel Colascione, Joel Fernandes, linux-man, Linux API, lkml
In-Reply-To: <f21dbd73-5ef4-fb5b-003f-ff4fec34a1de@gmail.com>

On Mon, Sep 23, 2019 at 11:12:00AM +0200, Michael Kerrisk (man-pages) wrote:
> Hello Christian and all,
> 
> Below, I have the rendered version of the current draft of
> the pidfd_send_signal(2) manual page that I have written.
> The page source can be found in a Git branch at:
> https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/log/?h=draft_pidfd
> 
> I would be pleased to receive corrections and notes on any
> details that should be added. (For example, are there error
> cases that I have missed?)
> 
> Would you be able to review please?

Michael,

A big big thank you for doing this! Really appreciated.
I'm happy to review this!

> 
> Thanks,
> 
> Michael
> 
> 
> NAME
>        pidfd_send_signal - send a signal to a process specified by a file
>        descriptor
> 
> SYNOPSIS
>        int pidfd_send_signal(int pidfd, int sig, siginfo_t info,
>                              unsigned int flags);
> 
> DESCRIPTION
>        The pidfd_send_signal() system call sends the signal  sig  to  the
>        target  process  referred  to by pidfd, a PID file descriptor that
>        refers to a process.
> 
>        If the info argument points to a  siginfo_t  buffer,  that  buffer
>        should be populated as described in rt_sigqueueinfo(2).
> 
>        If  the  info  argument  is  a NULL pointer, this is equivalent to
>        specifying a pointer to a siginfo_t buffer whose fields match  the
>        values  that  are  implicitly supplied when a signal is sent using
>        kill(2):
> 
>        *  si_signo is set to the signal number;
>        *  si_errno is set to 0;
>        *  si_code is set to SI_USER;
>        *  si_pid is set to the caller's PID; and
>        *  si_uid is set to the caller's real user ID.
> 
>        The calling process must either be in the same  PID  namespace  as
>        the  process  referred  to  by pidfd, or be in an ancestor of that
>        namespace.
> 
>        The flags argument is reserved for  future  use;  currently,  this
>        argument must be specified as 0.
> 
> RETURN VALUE
>        On  success,  pidfd_send_signal()  returns  0.   On success, -1 is

This should probably be "On error, -1 is [...]".

>        returned and errno is set to indicate the cause of the error.
> 
> ERRORS
>        EBADF  pidfd is not a valid PID file descriptor.
> 
>        EINVAL sig is not a valid signal.
> 
>        EINVAL The calling process is not in a PID namespace from which it
>               can send a signal to the target process.
> 
>        EINVAL flags is not 0.
> 
>        EPERM  The  calling  process  does not have permission to send the
>               signal to the target process.
> 
>        EPERM  pidfd  doesn't  refer   to   the   calling   process,   and
>               info.si_code is invalid (see rt_sigqueueinfo(2)).
> 
>        ESRCH  The target process does not exist.
> 
> VERSIONS
>        pidfd_send_signal() first appeared in Linux 5.1.
> 
> CONFORMING TO
>        pidfd_send_signal() is Linux specific.
> 
> NOTES
>        Currently, there is no glibc wrapper for this system call; call it
>        using syscall(2).
> 
>    PID file descriptors
>        The pidfd argument is a PID file  descriptor,  a  file  descriptor
>        that  refers  to  process.  Such a file descriptor can be obtained
>        in any of the following ways:
> 
>        *  by opening a /proc/[pid] directory;
> 
>        *  using pidfd_open(2); or
> 
>        *  via the PID file descriptor that  is  returned  by  a  call  to
>           clone(2) or clone3(2) that specifies the CLONE_PIDFD flag.
> 
>        The  pidfd_send_signal()  system call allows the avoidance of race
>        conditions that occur when using traditional interfaces  (such  as
>        kill(2)) to signal a process.  The problem is that the traditional
>        interfaces specify the target process via a process ID (PID), with
>        the  result  that the sender may accidentally send a signal to the
>        wrong process if the originally intended target process has termi‐
>        nated  and its PID has been recycled for another process.  By con‐
>        trast, a PID file descriptor is a stable reference to  a  specific
>        process;  if  that  process  terminates,  then the file descriptor
>        ceases to be  valid  and  the  caller  of  pidfd_send_signal()  is
>        informed of this fact via an ESRCH error.
> 
> EXAMPLE
>        #define _GNU_SOURCE
>        #include <limits.h>
>        #include <signal.h>
>        #include <fcntl.h>
>        #include <stdio.h>
>        #include <string.h>
>        #include <stdlib.h>
>        #include <unistd.h>
>        #include <sys/syscall.h>
> 
>        #ifndef __NR_pidfd_send_signal
>        #define __NR_pidfd_send_signal 424
>        #endif
> 
>        static
>        int pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
>                unsigned int flags)
>        {
>            return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
>        }
> 
>        int
>        main(int argc, char *argv[])
>        {
>            siginfo_t info;
>            char path[PATH_MAX];
>            int pidfd, sig;
> 
>            if (argc != 3) {
>                fprintf(stderr, "Usage: %s <pid> <signal>\n", argv[0]);
>                exit(EXIT_FAILURE);
>            }
> 
>            sig = atoi(argv[2]);
> 
>            /* Obtain a PID file descriptor by opening the /proc/PID directory
>               of the target process */
> 
>            snprintf(path, sizeof(path), "/proc/%s", argv[1]);
> 
>            pidfd = open(path, O_RDONLY);
>            if (pidfd == -1) {
>                perror("open");
>                exit(EXIT_FAILURE);
>            }
> 
>            /* Populate a 'siginfo_t' structure for use with
>               pidfd_send_signal() */
> 
>            memset(&info, 0, sizeof(info));
>            info.si_code = SI_QUEUE;
>            info.si_signo = sig;
>            info.si_errno = 0;
>            info.si_uid = getuid();
>            info.si_pid = getpid();
>            info.si_value.sival_int = 1234;
> 
>            /* Send the signal */
> 
>            if (pidfd_send_signal(pidfd, sig, &info, 0) == -1) {
>                perror("pidfd_send_signal");
>                exit(EXIT_FAILURE);
>            }
> 
>            exit(EXIT_SUCCESS);
>        }
> 
> SEE ALSO
>        clone(2),   kill(2),   pidfd_open(2),  rt_sigqueueinfo(2),  sigac‐
>        tion(2), pid_namespaces(7), signal(7)
> 


* Re: For review: pidfd_open(2) manual page
From: Christian Brauner @ 2019-09-23 14:38 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Christian Brauner, Jann Horn, Daniel Colascione,
	Eric W. Biederman, Joel Fernandes, Linux API, lkml, linux-man,
	Oleg Nesterov
In-Reply-To: <90399dee-53d8-a82c-3871-9ec8f94601ce@gmail.com>

On Mon, Sep 23, 2019 at 11:11:53AM +0200, Michael Kerrisk (man-pages) wrote:
> Hello Christian and all,
> 
> Below, I have the rendered version of the current draft of
> the pidfd_open(2) manual page that I have written.
> The page source can be found in a Git branch at:
> https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/log/?h=draft_pidfd
> 
> I would be pleased to receive corrections and notes on any
> details that should be added. (For example, are there error
> cases that I have missed?)
> 
> Would you be able to review please?

Again, thank you Michael for doing this!

> 
> Thanks,
> 
> Michael
> 
> 
> NAME
>        pidfd_open - obtain a file descriptor that refers to a process
> 
> SYNOPSIS
>        int pidfd_open(pid_t pid, unsigned int flags);
> 
> DESCRIPTION
>        The  pidfd_open()  system creates a file descriptor that refers to

s/system/system call/

>        the process whose PID is specified in pid.  The file descriptor is
>        returned  as the function result; the close-on-exec flag is set on
>        the file descriptor.
> 
>        The flags argument is reserved for  future  use;  currently,  this
>        argument must be specified as 0.
> 
> RETURN VALUE
>        On  success,  pidfd_open()  returns a nonnegative file descriptor.
>        On success, -1 is returned and errno is set to indicate the  cause

s/On success/On error/g

>        of the error.
> 
> ERRORS
>        EINVAL flags is not 0.
> 
>        EINVAL pid is not valid.
> 
>        ESRCH  The process specified by pid does not exist.
> 
> VERSIONS
>        pidfd_open() first appeared in Linux 5.3.
> 
> CONFORMING TO
>        pidfd_open() is Linux specific.
> 
> NOTES
>        Currently, there is no glibc wrapper for this system call; call it
>        using syscall(2).
> 
>        The pidfd_send_signal(2) system call can be used to send a  signal
>        to the process referred to by a PID file descriptor.
> 
>        A  PID  file descriptor can be monitored using poll(2), select(2),
>        and epoll(7).  When the process that it refers to terminates,  the
>        file descriptor indicates as readable.  Note, however, that in the

Not a native English speaker but should this be "indicates it is
readable"?

>        current implementation, nothing can be read from the file descrip‐
>        tor.
> 
>        The  pidfd_open()  system call is the preferred way of obtaining a
>        PID file descriptor.  The alternative is to obtain a file descrip‐
>        tor by opening a /proc/[pid] directory.  However, the latter tech‐
>        nique is possible only if the proc(5) file system is mounted; fur‐
>        thermore,  the  file  descriptor  obtained in this way is not pol‐
>        lable.

I mentioned this already in the CLONE_PIDFD manpage, we should probably
not make a big deal out of this and not mention /proc/<pid> here at all.
(Crazy idea, but we could also have a config option that allows you to
turn off proc-pid-dirfds as pidfds if we start to feel really strongly
about this, or a sysctl, whatever...)

> 
>        See also the discussion of the CLONE_PIDFD flag in clone(2).
> 
> EXAMPLE
>        The program below opens a PID  file  descriptor  for  the  process
>        whose PID is specified as its command-line argument.  It then mon‐
>        itors the file descriptor for readability (POLLIN) using  poll(2).

Yeah, maybe say "monitors the file descriptor for process exit indicated
by an EPOLLIN event" or something. Readability might be confusing.

>        When  the  process  with  the specified by PID terminates, poll(2)
>        returns, and indicates that the file descriptor is readable.

See comment above "readable". (I'm on my phone and I think someone
pointed this out already.)

> 
>    Program source
> 
>        #define _GNU_SOURCE
>        #include <sys/syscall.h>
>        #include <unistd.h>
>        #include <poll.h>
>        #include <stdlib.h>
>        #include <stdio.h>
> 
>        #ifndef __NR_pidfd_open
>        #define __NR_pidfd_open 434
>        #endif

Alpha is special... (and not in a good way).
So you would need to special case Alpha since that's the only arch where
we haven't been able to unify syscall numbering. :D
But it's not super important.
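
A minimal sketch of what the special case could look like, in case it
helps. The Alpha number (544) is an assumption taken from the arch/alpha
syscall table and should be checked against the installed headers. The
plain 434 fallback only suits architectures that use the unified numbers
without an ABI-specific base, and raw_pidfd_open() is a made-up name for
the example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Only used when the installed kernel headers predate pidfd_open(). */
#ifndef __NR_pidfd_open
# ifdef __alpha__
#  define __NR_pidfd_open 544   /* assumed Alpha number; verify locally */
# else
#  define __NR_pidfd_open 434   /* unified number used elsewhere */
# endif
#endif

/* Hypothetical helper name, not part of any libc. */
static int
raw_pidfd_open(pid_t pid, unsigned int flags)
{
    return syscall(__NR_pidfd_open, pid, flags);
}
```

When the headers already define __NR_pidfd_open, the fallback block is
skipped entirely, so the guess above only matters on older toolchains.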

I like the program example.

> 
>        static
>        int pidfd_open(pid_t pid, unsigned int flags)
>        {
>            return syscall(__NR_pidfd_open, pid, flags);
>        }
> 
>        int
>        main(int argc, char *argv[])
>        {
>            struct pollfd pollfd;
>            int pidfd, ready;
> 
>            if (argc != 2) {
>                fprintf(stderr, "Usage: %s <pid>\n", argv[0]);
>                exit(EXIT_SUCCESS);
>            }
> 
>            pidfd = pidfd_open(atoi(argv[1]), 0);
>            if (pidfd == -1) {
>                perror("pidfd_open");
>                exit(EXIT_FAILURE);
>            }
> 
>            pollfd.fd = pidfd;
>            pollfd.events = POLLIN;
> 
>            ready = poll(&pollfd, 1, -1);
>            if (ready == -1) {
>                perror("poll");
>                exit(EXIT_FAILURE);
>            }
> 
>            printf("Events (0x%x): POLLIN is %sset\n", pollfd.revents,
>                    (pollfd.revents & POLLIN) ? "" : "not ");
> 
>            exit(EXIT_SUCCESS);
>        }
> 
> SEE ALSO
>        clone(2),  kill(2),  pidfd_send_signal(2),   poll(2),   select(2),
>        epoll(7)
> 
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/

* Re: For review: pidfd_open(2) manual page
From: Christian Brauner @ 2019-09-23 14:47 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Michael Kerrisk (man-pages), Christian Brauner, Jann Horn,
	Daniel Colascione, Eric W. Biederman, Joel Fernandes, Linux API,
	lkml, linux-man, Oleg Nesterov
In-Reply-To: <87tv939td6.fsf@mid.deneb.enyo.de>

On Mon, Sep 23, 2019 at 12:53:09PM +0200, Florian Weimer wrote:
> * Michael Kerrisk:
> 
> > SYNOPSIS
> >        int pidfd_open(pid_t pid, unsigned int flags);
> 
> Should this mention <sys/types.h> for pid_t?
> 
> > ERRORS
> >        EINVAL flags is not 0.
> >
> >        EINVAL pid is not valid.
> >
> >        ESRCH  The process specified by pid does not exist.
> 
> Presumably, EMFILE and ENFILE are also possible errors, and so is
> ENOMEM.

So, error codes that could surface are:
EMFILE: too many open files
ENODEV: the anon inode filesystem is not available in this kernel (unlikely)
ENOMEM: not enough memory (to allocate the backing struct file)
ENFILE: you're over the system-wide max_files limit, which can be set through /proc/sys/fs/file-max

I think that should be it.
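
For callers that want to report these cases, a rough sketch of a lookup
helper follows. The function name pidfd_open_error_hint() and the exact
wording of the strings are invented for illustration; the set of codes is
just the list above plus the EINVAL and ESRCH cases from the draft page.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: map an errno from pidfd_open() to a short hint. */
static const char *
pidfd_open_error_hint(int err)
{
    switch (err) {
    case EINVAL: return "flags is not 0, or pid is not valid";
    case ESRCH:  return "no process with the given pid exists";
    case EMFILE: return "too many open files in this process";
    case ENFILE: return "system-wide limit on open files reached";
    case ENODEV: return "anon inode filesystem not available in this kernel";
    case ENOMEM: return "not enough memory for the backing struct file";
    default:     return strerror(err);   /* anything not listed above */
    }
}
```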

> 
> >        A  PID  file descriptor can be monitored using poll(2), select(2),
> >        and epoll(7).  When the process that it refers to terminates,  the
> >        file descriptor indicates as readable.  Note, however, that in the
> >        current implementation, nothing can be read from the file descrip‐
> >        tor.
> 
> “is indicated as readable” or “becomes readable”?  Will reading block?
> 
> >        The  pidfd_open()  system call is the preferred way of obtaining a
> >        PID file descriptor.  The alternative is to obtain a file descrip‐
> >        tor by opening a /proc/[pid] directory.  However, the latter tech‐
> >        nique is possible only if the proc(5) file system is mounted; fur‐
> >        thermore,  the  file  descriptor  obtained in this way is not pol‐
> >        lable.
> 
> One question is whether the glibc wrapper should fall back back to the
> /proc subdirectory if it is not available.  Probably not.

No, that would not be transparent to userspace. Especially because both
fds differ in what can be done with them.

> 
> >        static
> >        int pidfd_open(pid_t pid, unsigned int flags)
> >        {
> >            return syscall(__NR_pidfd_open, pid, flags);
> >        }
> 
> Please call this function something else (not pidfd_open), so that the
> example continues to work if glibc provides the system call wrapper.

Agreed!

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
From: Andy Lutomirski @ 2019-09-23 18:33 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Linus Torvalds, Andy Lutomirski, Ahmed S. Darwish,
	Lennart Poettering, Theodore Y. Ts'o, Eric W. Biederman,
	Alexander E. Patrakov, Michael Kerrisk, Willy Tarreau,
	Matthew Garrett, lkml, Ext4 Developers List, Linux API, linux-man
In-Reply-To: <87blvefai7.fsf@oldenburg2.str.redhat.com>

On Fri, Sep 20, 2019 at 11:07 PM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Linus Torvalds:
>
> > Violently agreed. And that's kind of what the GRND_EXPLICIT is really
> > aiming for.
> >
> > However, it's worth noting that nobody should ever use GRND_EXPLICIT
> > directly. That's just the name for the bit. The actual users would use
> > GRND_INSECURE or GRND_SECURE.
>
> Should we switch glibc's getentropy to GRND_EXPLICIT?  Or something
> else?
>
> I don't think we want to print a kernel warning for this function.
>

Contemplating this question, I think the answer is that we should just
not introduce GRND_EXPLICIT or anything like it.  glibc is going to
have to do *something*, and getentropy() is unlikely to just go away.
The explicitly documented semantics are that it blocks if the RNG
isn't seeded.

Similarly, FreeBSD has getrandom():

https://www.freebsd.org/cgi/man.cgi?query=getrandom&sektion=2&manpath=freebsd-release-ports

and if we make getrandom(..., 0) warn, then we have a situation where
the *correct* (if regrettable) way to use the function on FreeBSD
causes a warning on Linux.

Let's just add GRND_INSECURE, make the blocking mode work better, and,
if we're feeling a bit more adventurous, add GRND_SECURE_BLOCKING as a
better replacement for 0, convince FreeBSD to add it too, and then
worry about deprecating 0 once we at least get some agreement from the
FreeBSD camp.
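
To make the userland side concrete, here is a rough sketch of how a
caller might probe for the new flag. It assumes GRND_INSECURE becomes a
new flag bit (the 0x0004 fallback value is an assumption for headers
that lack it) and relies on getrandom() rejecting unknown flags with
EINVAL. The helper name getrandom_best_effort() is made up.

```c
#include <errno.h>
#include <sys/random.h>
#include <sys/types.h>

#ifndef GRND_INSECURE
#define GRND_INSECURE 0x0004   /* assumed value for older headers */
#endif

/*
 * Hypothetical helper: ask for best-effort bytes without blocking.
 * Falls back to GRND_NONBLOCK when the kernel does not know
 * GRND_INSECURE; that fallback may fail with EAGAIN before the
 * RNG is seeded.
 */
static ssize_t
getrandom_best_effort(void *buf, size_t len)
{
    ssize_t n = getrandom(buf, len, GRND_INSECURE);

    if (n == -1 && errno == EINVAL)    /* flag unknown to this kernel */
        n = getrandom(buf, len, GRND_NONBLOCK);
    return n;
}
```

Neither call passes 0, so a caller written this way would never trigger
the proposed warning.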

* Re: For review: pidfd_open(2) manual page
From: Michael Kerrisk (man-pages) @ 2019-09-23 20:20 UTC (permalink / raw)
  To: Florian Weimer
  Cc: mtk.manpages, Christian Brauner, Jann Horn, Daniel Colascione,
	Eric W. Biederman, Joel Fernandes, Linux API, lkml, linux-man,
	Oleg Nesterov
In-Reply-To: <87tv939td6.fsf@mid.deneb.enyo.de>

Hello Florian,

Thanks for taking a look at this page.

On 9/23/19 12:53 PM, Florian Weimer wrote:
> * Michael Kerrisk:
> 
>> SYNOPSIS
>>        int pidfd_open(pid_t pid, unsigned int flags);
> 
> Should this mention <sys/types.h> for pid_t?

Seems reasonable. I added this.

>> ERRORS
>>        EINVAL flags is not 0.
>>
>>        EINVAL pid is not valid.
>>
>>        ESRCH  The process specified by pid does not exist.
> 
> Presumably, EMFILE and ENFILE are also possible errors, and so is
> ENOMEM.

Thanks. I've added those.

>>        A  PID  file descriptor can be monitored using poll(2), select(2),
>>        and epoll(7).  When the process that it refers to terminates,  the
>>        file descriptor indicates as readable.  Note, however, that in the
>>        current implementation, nothing can be read from the file descrip‐
>>        tor.
> 
> “is indicated as readable” or “becomes readable”?  Will reading block?

It won't block. Reads from a pidfd always fail with the error EINVAL
(regardless of whether the target process has terminated).
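
A small sketch that checks this behavior, assuming a kernel with
pidfd_open() (Linux 5.3 or later). The helper name pidfd_read_errno() is
invented for the example, and the 434 fallback is the same one used in
the draft page's program.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef __NR_pidfd_open
#define __NR_pidfd_open 434   /* same fallback as the draft page's example */
#endif

/*
 * Open a pidfd for 'pid', attempt a one-byte read, and return the errno
 * that results (0 if the read unexpectedly succeeds). If the pidfd cannot
 * be opened at all, the errno from that failure is returned instead.
 */
static int
pidfd_read_errno(pid_t pid)
{
    char byte;
    int err = 0;
    int pidfd = syscall(__NR_pidfd_open, pid, 0);

    if (pidfd == -1)
        return errno;

    if (read(pidfd, &byte, 1) == -1)
        err = errno;

    close(pidfd);
    return err;
}
```

Calling pidfd_read_errno(getpid()) should yield EINVAL on a kernel that
supports the system call, and ENOSYS on one that does not.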

I specifically wanted to avoid "becomes readable" to avoid any
suggestion that read() does something for a pidfd. I thought 
"indicates as readable" was fine, but you, Christian and Joel 
all called this wording out, so I changed this to:

"When the process that it refers to terminates,
these interfaces indicate the file descriptor as readable."

>>        The  pidfd_open()  system call is the preferred way of obtaining a
>>        PID file descriptor.  The alternative is to obtain a file descrip‐
>>        tor by opening a /proc/[pid] directory.  However, the latter tech‐
>>        nique is possible only if the proc(5) file system is mounted; fur‐
>>        thermore,  the  file  descriptor  obtained in this way is not pol‐
>>        lable.
> 
> One question is whether the glibc wrapper should fall back back to the
> /proc subdirectory if it is not available.  Probably not.

No, since the FD returned by opening /proc/PID is less functional
(it is not pollable) than the one returned by pidfd_open().

>>        static
>>        int pidfd_open(pid_t pid, unsigned int flags)
>>        {
>>            return syscall(__NR_pidfd_open, pid, flags);
>>        }
> 
> Please call this function something else (not pidfd_open), so that the
> example continues to work if glibc provides the system call wrapper.

I figured that if the syscall does get added to glibc, then I would
modify the example. In the meantime, this does seem the most natural
way of doing things, since the example then uses the real syscall
name as it would be used if there were a wrapper function.
 
But, this leads to the question: what do you think the likelihood
is that this system call will land in glibc?

Thanks for your feedback, Florian. I've pushed various changes
to the Git branch at 
https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/log/?h=draft_pidfd

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

* Re: For review: pidfd_open(2) manual page
From: Michael Kerrisk (man-pages) @ 2019-09-23 20:21 UTC (permalink / raw)
  To: Christian Brauner
  Cc: mtk.manpages, Christian Brauner, Jann Horn, Daniel Colascione,
	Eric W. Biederman, Joel Fernandes, Linux API, lkml, linux-man,
	Oleg Nesterov
In-Reply-To: <20190923143846.u7miwgmszecankof@wittgenstein>

Hello Christian,

On 9/23/19 4:38 PM, Christian Brauner wrote:
> On Mon, Sep 23, 2019 at 11:11:53AM +0200, Michael Kerrisk (man-pages) wrote:
>> Hello Christian and all,
>>
>> Below, I have the rendered version of the current draft of
>> the pidfd_open(2) manual page that I have written.
>> The page source can be found in a Git branch at:
>> https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/log/?h=draft_pidfd
>>
>> I would be pleased to receive corrections and notes on any
>> details that should be added. (For example, are there error
>> cases that I have missed?)
>>
>> Would you be able to review please?
> 
> Again, thank you Michael for doing this!
> 
>>
>> Thanks,
>>
>> Michael
>>
>>
>> NAME
>>        pidfd_open - obtain a file descriptor that refers to a process
>>
>> SYNOPSIS
>>        int pidfd_open(pid_t pid, unsigned int flags);
>>
>> DESCRIPTION
>>        The  pidfd_open()  system creates a file descriptor that refers to
> 
> s/system/system call/

Fixed.

>>        the process whose PID is specified in pid.  The file descriptor is
>>        returned  as the function result; the close-on-exec flag is set on
>>        the file descriptor.
>>
>>        The flags argument is reserved for  future  use;  currently,  this
>>        argument must be specified as 0.
>>
>> RETURN VALUE
>>        On  success,  pidfd_open()  returns a nonnegative file descriptor.
>>        On success, -1 is returned and errno is set to indicate the  cause
> 
> s/On success/On error/g

Fixed.

>>        of the error.
>>
>> ERRORS
>>        EINVAL flags is not 0.
>>
>>        EINVAL pid is not valid.
>>
>>        ESRCH  The process specified by pid does not exist.
>>
>> VERSIONS
>>        pidfd_open() first appeared in Linux 5.3.
>>
>> CONFORMING TO
>>        pidfd_open() is Linux specific.
>>
>> NOTES
>>        Currently, there is no glibc wrapper for this system call; call it
>>        using syscall(2).
>>
>>        The pidfd_send_signal(2) system call can be used to send a  signal
>>        to the process referred to by a PID file descriptor.
>>
>>        A  PID  file descriptor can be monitored using poll(2), select(2),
>>        and epoll(7).  When the process that it refers to terminates,  the
>>        file descriptor indicates as readable.  Note, however, that in the
> 
> Not a native English speaker but should this be "indicates it is
> readable"?

See my reply to Florian.

>>        current implementation, nothing can be read from the file descrip‐
>>        tor.
>>
>>        The  pidfd_open()  system call is the preferred way of obtaining a
>>        PID file descriptor.  The alternative is to obtain a file descrip‐
>>        tor by opening a /proc/[pid] directory.  However, the latter tech‐
>>        nique is possible only if the proc(5) file system is mounted; fur‐
>>        thermore,  the  file  descriptor  obtained in this way is not pol‐
>>        lable.
> 
> I mentioned this already in the CLONE_PIDFD manpage, we should probably
> not make a big deal out of this and not mention /proc/<pid> here at all.

The thing is, people *will* learn about these two different types
of FDs, whether we document them or not. So, I think it's better to
be up front about what's available, and make a suitably strong
recommendation about the preferred technique.

Reading between the lines: just a couple of releases after it was
implemented, are you saying that open(/proc/PID) was a mistake?

> (Crazy idea, but we could also have a config option that allows you to
> turn of proc-pid-dirfds as pidfds if we start to feel really strongly
> about this or a sysctl whatever...)
> 
>>
>>        See also the discussion of the CLONE_PIDFD flag in clone(2).
>>
>> EXAMPLE
>>        The program below opens a PID  file  descriptor  for  the  process
>>        whose PID is specified as its command-line argument.  It then mon‐
>>        itors the file descriptor for readability (POLLIN) using  poll(2).
> 
> Yeah, maybe say "monitors the file descriptor for process exit indicated
> by an EPOLLIN event" or something. Readability might be confusing.

I like that suggestion! I reworded to something close to what you suggest.

>>        When  the  process  with  the specified by PID terminates, poll(2)
>>        returns, and indicates that the file descriptor is readable.
> 
> See comment above "readable". (I'm on my phone and I think someone
> pointed this out already.)

Actually, I think I can just remove that sentence. It doesn't really
add much.

>>    Program source
>>
>>        #define _GNU_SOURCE
>>        #include <sys/syscall.h>
>>        #include <unistd.h>
>>        #include <poll.h>
>>        #include <stdlib.h>
>>        #include <stdio.h>
>>
>>        #ifndef __NR_pidfd_open
>>        #define __NR_pidfd_open 434
>>        #endif
> 
> Alpha is special... (and not in a good way).
> So you would need to special case Alpha since that's the only arch where
> we haven't been able to unify syscall numbering. :D
> But it's not super important.

Okay.

> I like the program example.

Good.

Thanks for reviewing! I've pushed various changes
to the Git branch at 
https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/log/?h=draft_pidfd


Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

* Re: For review: pidfd_open(2) manual page
From: Michael Kerrisk (man-pages) @ 2019-09-23 20:22 UTC (permalink / raw)
  To: Christian Brauner, Florian Weimer
  Cc: mtk.manpages, Christian Brauner, Jann Horn, Daniel Colascione,
	Eric W. Biederman, Joel Fernandes, Linux API, lkml, linux-man,
	Oleg Nesterov
In-Reply-To: <20190923144711.ssbrg6bdquhewo7q@wittgenstein>

Hello Christian,

On 9/23/19 4:47 PM, Christian Brauner wrote:
> On Mon, Sep 23, 2019 at 12:53:09PM +0200, Florian Weimer wrote:
>> * Michael Kerrisk:
>>
>>> SYNOPSIS
>>>        int pidfd_open(pid_t pid, unsigned int flags);
>>
>> Should this mention <sys/types.h> for pid_t?
>>
>>> ERRORS
>>>        EINVAL flags is not 0.
>>>
>>>        EINVAL pid is not valid.
>>>
>>>        ESRCH  The process specified by pid does not exist.
>>
>> Presumably, EMFILE and ENFILE are also possible errors, and so is
>> ENOMEM.
> 
> So, error codes that could surface are:
> EMFILE: too many open files
> ENODEV: the anon inode filesystem is not available in this kernel (unlikely)
> ENOMEM: not enough memory (to allocate the backing struct file)
> ENFILE: you're over the max_files limit which can be set through proc
> 
> I think that should be it.

Thanks. I've added those.

>>>        A  PID  file descriptor can be monitored using poll(2), select(2),
>>>        and epoll(7).  When the process that it refers to terminates,  the
>>>        file descriptor indicates as readable.  Note, however, that in the
>>>        current implementation, nothing can be read from the file descrip‐
>>>        tor.
>>
>> “is indicated as readable” or “becomes readable”?  Will reading block?
>>
>>>        The  pidfd_open()  system call is the preferred way of obtaining a
>>>        PID file descriptor.  The alternative is to obtain a file descrip‐
>>>        tor by opening a /proc/[pid] directory.  However, the latter tech‐
>>>        nique is possible only if the proc(5) file system is mounted; fur‐
>>>        thermore,  the  file  descriptor  obtained in this way is not pol‐
>>>        lable.
>>
>> One question is whether the glibc wrapper should fall back back to the
>> /proc subdirectory if it is not available.  Probably not.
> 
> No, that would not be transparent to userspace. Especially because both
> fds differ in what can be done with them.
> 
>>
>>>        static
>>>        int pidfd_open(pid_t pid, unsigned int flags)
>>>        {
>>>            return syscall(__NR_pidfd_open, pid, flags);
>>>        }
>>
>> Please call this function something else (not pidfd_open), so that the
>> example continues to work if glibc provides the system call wrapper.
> 
> Agreed!

See my reply to Florian. (So far, I didn't change anything here.)

Thanks,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

* Re: For review: pidfd_open(2) manual page
From: Michael Kerrisk (man-pages) @ 2019-09-23 20:22 UTC (permalink / raw)
  To: Daniel Colascione, Florian Weimer
  Cc: mtk.manpages, Christian Brauner, Jann Horn, Eric W. Biederman,
	Joel Fernandes, Linux API, lkml, linux-man, Oleg Nesterov
In-Reply-To: <CAKOZuetTgKjgWZpCaBz8q662MwVQ-UhrV4oWFqKEWr35mQTFLw@mail.gmail.com>

Hello Daniel,

Thank you for reviewing the page!

On 9/23/19 1:26 PM, Daniel Colascione wrote:
> On Mon, Sep 23, 2019 at 3:53 AM Florian Weimer <fw@deneb.enyo.de> wrote:
>>
>> * Michael Kerrisk:
>>
>>> SYNOPSIS
>>>        int pidfd_open(pid_t pid, unsigned int flags);
>>
>> Should this mention <sys/types.h> for pid_t?
>>
>>> ERRORS
>>>        EINVAL flags is not 0.
>>>
>>>        EINVAL pid is not valid.
>>>
>>>        ESRCH  The process specified by pid does not exist.
>>
>> Presumably, EMFILE and ENFILE are also possible errors, and so is
>> ENOMEM.
>>
>>>        A  PID  file descriptor can be monitored using poll(2), select(2),
>>>        and epoll(7).  When the process that it refers to terminates,  the
>>>        file descriptor indicates as readable.
> 
> The phrase "becomes readable" is simpler than "indicates as readable"
> and conveys the same meaning. I agree with Florian's comment on this
> point below.

See my reply to Florian. (I did change the text here.)

>>> Note, however, that in the
>>>        current implementation, nothing can be read from the file descrip‐
>>>        tor.
>>
>> “is indicated as readable” or “becomes readable”?  Will reading block?
>>
>>>        The  pidfd_open()  system call is the preferred way of obtaining a
>>>        PID file descriptor.  The alternative is to obtain a file descrip‐
>>>        tor by opening a /proc/[pid] directory.  However, the latter tech‐
>>>        nique is possible only if the proc(5) file system is mounted; fur‐
>>>        thermore,  the  file  descriptor  obtained in this way is not pol‐
>>>        lable.
> 
> Referring to procfs directory FDs as pidfds will probably confuse
> people. I'd just omit this paragraph.

See my reply to Christian (and feel free to argue the point, please).
So far, I have made no change here.

>> One question is whether the glibc wrapper should fall back back to the
>> /proc subdirectory if it is not available.  Probably not.
> 
> I'd prefer that glibc not provide this kind of fallback.
> posix_fallocate-style emulation is, IMHO, too surprising.

Agreed.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

* Re: For review: pidfd_send_signal(2) manual page
From: Michael Kerrisk (man-pages) @ 2019-09-23 20:27 UTC (permalink / raw)
  To: Christian Brauner
  Cc: mtk.manpages, Oleg Nesterov, Christian Brauner, Jann Horn,
	Eric W. Biederman, Daniel Colascione, Joel Fernandes, linux-man,
	Linux API, lkml
In-Reply-To: <20190923142932.2gujbddnzyp4ujeu@wittgenstein>

Hello Christian,

On 9/23/19 4:29 PM, Christian Brauner wrote:
> On Mon, Sep 23, 2019 at 11:12:00AM +0200, Michael Kerrisk (man-pages) wrote:
>> Hello Christian and all,
>>
>> Below, I have the rendered version of the current draft of
>> the pidfd_send_signal(2) manual page that I have written.
>> The page source can be found in a Git branch at:
>> https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/log/?h=draft_pidfd
>>
>> I would be pleased to receive corrections and notes on any
>> details that should be added. (For example, are there error
>> cases that I have missed?)
>>
>> Would you be able to review please?
> 
> Michael,
> 
> A big big thank you for doing this! Really appreciated.
> I'm happy to review this!
> 
>>
>> Thanks,
>>
>> Michael
>>
>>
>> NAME
>>        pidfd_send_signal - send a signal to a process specified by a file
>>        descriptor
>>
>> SYNOPSIS
>>        int pidfd_send_signal(int pidfd, int sig, siginfo_t info,
>>                              unsigned int flags);
>>
>> DESCRIPTION
>>        The pidfd_send_signal() system call sends the signal  sig  to  the
>>        target  process  referred  to by pidfd, a PID file descriptor that
>>        refers to a process.
>>
>>        If the info argument points to a  siginfo_t  buffer,  that  buffer
>>        should be populated as described in rt_sigqueueinfo(2).
>>
>>        If  the  info  argument  is  a NULL pointer, this is equivalent to
>>        specifying a pointer to a siginfo_t buffer whose fields match  the
>>        values  that  are  implicitly supplied when a signal is sent using
>>        kill(2):
>>
>>        *  si_signo is set to the signal number;
>>        *  si_errno is set to 0;
>>        *  si_code is set to SI_USER;
>>        *  si_pid is set to the caller's PID; and
>>        *  si_uid is set to the caller's real user ID.
>>
>>        The calling process must either be in the same  PID  namespace  as
>>        the  process  referred  to  by pidfd, or be in an ancestor of that
>>        namespace.
>>
>>        The flags argument is reserved for  future  use;  currently,  this
>>        argument must be specified as 0.
>>
>> RETURN VALUE
>>        On  success,  pidfd_send_signal()  returns  0.   On success, -1 is
> 
> This should probably be "On error, -1 is [...]".

Thanks. Fixed.


Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

* Re: For review: pidfd_open(2) manual page
From: Florian Weimer @ 2019-09-23 20:41 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Christian Brauner, Jann Horn, Daniel Colascione,
	Eric W. Biederman, Joel Fernandes, Linux API, lkml, linux-man,
	Oleg Nesterov
In-Reply-To: <63566f1f-667d-50ca-ae85-784924d09af4@gmail.com>

* Michael Kerrisk:

>>>        static
>>>        int pidfd_open(pid_t pid, unsigned int flags)
>>>        {
>>>            return syscall(__NR_pidfd_open, pid, flags);
>>>        }
>> 
>> Please call this function something else (not pidfd_open), so that the
>> example continues to work if glibc provides the system call wrapper.
>
> I figured that if the syscall does get added to glibc, then I would
> modify the example. In the meantime, this does seem the most natural
> way of doing things, since the example then uses the real syscall
> name as it would be used if there were a wrapper function.

The problem is that programs do this as well, so they fail to build
once they are built on a newer glibc version.

> But, this leads to the question: what do you think the likelihood
> is that this system call will land in glibc?

Quite likely.  It's easy enough to document, there are no P&C issues,
and it doesn't need any new types.

pidfd_send_signal is slightly more difficult because we probably need
to add rt_sigqueueinfo first, for consistency.

* Re: For review: pidfd_open(2) manual page
From: Michael Kerrisk (man-pages) @ 2019-09-23 20:57 UTC (permalink / raw)
  To: Florian Weimer
  Cc: mtk.manpages, Christian Brauner, Jann Horn, Daniel Colascione,
	Eric W. Biederman, Joel Fernandes, Linux API, lkml, linux-man,
	Oleg Nesterov
In-Reply-To: <874l12924w.fsf@mid.deneb.enyo.de>

Hello Florian,

On 9/23/19 10:41 PM, Florian Weimer wrote:
> * Michael Kerrisk:
> 
>>>>        static
>>>>        int pidfd_open(pid_t pid, unsigned int flags)
>>>>        {
>>>>            return syscall(__NR_pidfd_open, pid, flags);
>>>>        }
>>>
>>> Please call this function something else (not pidfd_open), so that the
>>> example continues to work if glibc provides the system call wrapper.
>>
>> I figured that if the syscall does get added to glibc, then I would
>> modify the example. In the meantime, this does seem the most natural
>> way of doing things, since the example then uses the real syscall
>> name as it would be used if there were a wrapper function.
> 
> The problem is that programs do this as well, so they fail to build
> once they are built on a newer glibc version.

But isn't such a failure a good thing? I mean: it encourages
people to rid their programs of uses of syscall(2).

>> But, this leads to the question: what do you think the likelihood
>> is that this system call will land in glibc?
> 
> Quite likely.  It's easy enough to document, there are no P&C issues,
> and it doesn't need any new types.

Okay.

> pidfd_send_signal is slightly more difficult because we probably need
> to add rt_sigqueueinfo first, for consistency.

Okay. I see that's a little more problematic.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

* Re: For review: pidfd_send_signal(2) manual page
From: Eric W. Biederman @ 2019-09-23 21:27 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Oleg Nesterov, Christian Brauner, Jann Horn, Daniel Colascione,
	Joel Fernandes, linux-man, Linux API, lkml
In-Reply-To: <f21dbd73-5ef4-fb5b-003f-ff4fec34a1de@gmail.com>

"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:

> Hello Christian and all,
>
> Below, I have the rendered version of the current draft of
> the pidfd_send_signal(2) manual page that I have written.
> The page source can be found in a Git branch at:
> https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/log/?h=draft_pidfd
>
> I would be pleased to receive corrections and notes on any
> details that should be added. (For example, are there error
> cases that I have missed?)
>
> Would you be able to review please?
>
> Thanks,
>
> Michael
>
>
> NAME
>        pidfd_send_signal - send a signal to a process specified by a file
>        descriptor
>
> SYNOPSIS
>        int pidfd_send_signal(int pidfd, int sig, siginfo_t info,

 This needs to be "siginfo_t *info," -----------------------^

>                              unsigned int flags);
>

Eric

* Re: For review: pidfd_open(2) manual page
From: Christian Brauner @ 2019-09-24  7:38 UTC
  To: Florian Weimer
  Cc: Michael Kerrisk (man-pages), Jann Horn, Daniel Colascione,
	Eric W. Biederman, Joel Fernandes, Linux API, lkml, linux-man,
	Oleg Nesterov
In-Reply-To: <874l12924w.fsf@mid.deneb.enyo.de>

On Mon, Sep 23, 2019 at 10:41:19PM +0200, Florian Weimer wrote:
> * Michael Kerrisk:
> 
> >>>        static
> >>>        int pidfd_open(pid_t pid, unsigned int flags)
> >>>        {
> >>>            return syscall(__NR_pidfd_open, pid, flags);
> >>>        }
> >> 
> >> Please call this function something else (not pidfd_open), so that the
> >> example continues to work if glibc provides the system call wrapper.
> >
> > I figured that if the syscall does get added to glibc, then I would
> > modify the example. In the meantime, this does seem the most natural
> > way of doing things, since the example then uses the real syscall
> > name as it would be used if there were a wrapper function.
> 
> The problem is that programs do this as well, so they fail to build
> once they are built on a newer glibc version.
> 
> > But, this leads to the question: what do you think the likelihood
> > is that this system call will land in glibc?
> 
> Quite likely.  It's easy enough to document, there are no P&C issues,
> and it doesn't need any new types.

My previous mail probably didn't make it, so here it is again: given the
recently established glibc consensus to provide wrappers for all new
system calls (with some sensible exceptions), I'd expect this to be the
case.

> 
> pidfd_send_signal is slightly more difficult because we probably need
> to add rt_sigqueueinfo first, for consistency.

Oh, huh. Somehow I thought we already provide that.

Christian

* Re: [RFC PATCH 2/3] fs: add RWF_ENCODED for writing compressed data
From: Omar Sandoval @ 2019-09-24 17:15 UTC
  To: Jann Horn
  Cc: Jens Axboe, linux-fsdevel, linux-btrfs, Dave Chinner, Linux API,
	Kernel Team, Andy Lutomirski
In-Reply-To: <CAG48ez2GKv15Uj6Wzv0sG5v2bXyrSaCtRTw5Ok_ovja_CiO_fQ@mail.gmail.com>

On Thu, Sep 19, 2019 at 05:44:12PM +0200, Jann Horn wrote:
> On Thu, Sep 19, 2019 at 8:54 AM Omar Sandoval <osandov@osandov.com> wrote:
> > Btrfs can transparently compress data written by the user. However, we'd
> > like to add an interface to write pre-compressed data directly to the
> > filesystem. This adds support for so-called "encoded writes" via
> > pwritev2().
> >
> > A new RWF_ENCODED flag indicates that a write is "encoded". If this
> > flag is set, iov[0].iov_base points to a struct encoded_iov which
> > contains metadata about the write: namely, the compression algorithm and
> > the unencoded (i.e., decompressed) length of the extent. iov[0].iov_len
> > must be set to sizeof(struct encoded_iov), which can be used to extend
> > the interface in the future. The remaining iovecs contain the encoded
> > extent.
> >
> > A similar interface for reading encoded data can be added to preadv2()
> > in the future.
> >
> > Filesystems must indicate that they support encoded writes by setting
> > FMODE_ENCODED_IO in ->file_open().
> [...]
> > +int import_encoded_write(struct kiocb *iocb, struct encoded_iov *encoded,
> > +                        struct iov_iter *from)
> > +{
> > +       if (iov_iter_single_seg_count(from) != sizeof(*encoded))
> > +               return -EINVAL;
> > +       if (copy_from_iter(encoded, sizeof(*encoded), from) != sizeof(*encoded))
> > +               return -EFAULT;
> > +       if (encoded->compression == ENCODED_IOV_COMPRESSION_NONE &&
> > +           encoded->encryption == ENCODED_IOV_ENCRYPTION_NONE) {
> > +               iocb->ki_flags &= ~IOCB_ENCODED;
> > +               return 0;
> > +       }
> > +       if (encoded->compression > ENCODED_IOV_COMPRESSION_TYPES ||
> > +           encoded->encryption > ENCODED_IOV_ENCRYPTION_TYPES)
> > +               return -EINVAL;
> > +       if (!capable(CAP_SYS_ADMIN))
> > +               return -EPERM;
> 
> How does this capable() check interact with io_uring? Without having
> looked at this in detail, I suspect that when an encoded write is
> requested through io_uring, the capable() check might be executed on
> something like a workqueue worker thread, which is probably running
> with a full capability set.

I discussed this more with Jens. You're right, per-IO permission checks
aren't going to work. In fully-polled mode, we never get an opportunity
to check capabilities in the right context. So, this will probably require a
new open flag.

* Re: For review: pidfd_send_signal(2) manual page
From: Michael Kerrisk (man-pages) @ 2019-09-24 19:10 UTC
  To: Eric W. Biederman
  Cc: mtk.manpages, Oleg Nesterov, Christian Brauner, Jann Horn,
	Daniel Colascione, Joel Fernandes, linux-man, Linux API, lkml
In-Reply-To: <87ftkmu2i6.fsf@x220.int.ebiederm.org>

On 9/23/19 11:27 PM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
> 
>> Hello Christian and all,
>>
>> Below, I have the rendered version of the current draft of
>> the pidfd_send_signal(2) manual page that I have written.
>> The page source can be found in a Git branch at:
>> https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/log/?h=draft_pidfd
>>
>> I would be pleased to receive corrections and notes on any
>> details that should be added. (For example, are there error
>> cases that I have missed?)
>>
>> Would you be able to review please?
>>
>> Thanks,
>>
>> Michael
>>
>>
>> NAME
>>        pidfd_send_signal - send a signal to a process specified by a file
>>        descriptor
>>
>> SYNOPSIS
>>        int pidfd_send_signal(int pidfd, int sig, siginfo_t info,
> 
>  This needs to be "siginfo_t *info," -----------------------^

Thanks, Eric. Fixed.
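
A minimal sketch of a caller using the corrected prototype, with info
passed as a pointer. The names sys_pidfd_send_signal() and
probe_self() are hypothetical. The fallback numbers 424 and 434 are
assumptions that hold for x86-64 and the asm-generic syscall table. The
sketch also assumes that, as with kill(2), signal 0 performs only the
existence and permission checks.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older headers; both values are assumptions that hold on
 * x86-64 and asm-generic architectures. */
#ifndef __NR_pidfd_open
#define __NR_pidfd_open 434
#endif
#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424
#endif

/* Hypothetical wrapper name. Note that info is a POINTER to siginfo_t. */
static int sys_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
                                 unsigned int flags)
{
    return (int) syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
}

/* Obtain a pidfd for the calling process and send it signal 0.
 * A NULL info asks the kernel to fill in the siginfo as kill(2) would.
 * Returns 0 on success, or -1 with errno set. */
static int probe_self(void)
{
    int pidfd = (int) syscall(__NR_pidfd_open, getpid(), 0);
    int ret, saved_errno;

    if (pidfd < 0)
        return -1;

    ret = sys_pidfd_send_signal(pidfd, 0, NULL, 0);
    saved_errno = errno;        /* close() may overwrite errno */
    close(pidfd);
    errno = saved_errno;
    return ret;
}
```

Passing a siginfo_t by value, as the draft SYNOPSIS had it, would not
match what the kernel expects in the third argument.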

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

* Re: [RFC PATCH 2/3] fs: add RWF_ENCODED for writing compressed data
From: Omar Sandoval @ 2019-09-24 19:35 UTC
  To: Jann Horn
  Cc: Jens Axboe, linux-fsdevel, linux-btrfs, Dave Chinner, Linux API,
	Kernel Team, Andy Lutomirski
In-Reply-To: <20190924171513.GA39872@vader>

On Tue, Sep 24, 2019 at 10:15:13AM -0700, Omar Sandoval wrote:
> On Thu, Sep 19, 2019 at 05:44:12PM +0200, Jann Horn wrote:
> > On Thu, Sep 19, 2019 at 8:54 AM Omar Sandoval <osandov@osandov.com> wrote:
> > > Btrfs can transparently compress data written by the user. However, we'd
> > > like to add an interface to write pre-compressed data directly to the
> > > filesystem. This adds support for so-called "encoded writes" via
> > > pwritev2().
> > >
> > > A new RWF_ENCODED flag indicates that a write is "encoded". If this
> > > flag is set, iov[0].iov_base points to a struct encoded_iov which
> > > contains metadata about the write: namely, the compression algorithm and
> > > the unencoded (i.e., decompressed) length of the extent. iov[0].iov_len
> > > must be set to sizeof(struct encoded_iov), which can be used to extend
> > > the interface in the future. The remaining iovecs contain the encoded
> > > extent.
> > >
> > > A similar interface for reading encoded data can be added to preadv2()
> > > in the future.
> > >
> > > Filesystems must indicate that they support encoded writes by setting
> > > FMODE_ENCODED_IO in ->file_open().
> > [...]
> > > +int import_encoded_write(struct kiocb *iocb, struct encoded_iov *encoded,
> > > +                        struct iov_iter *from)
> > > +{
> > > +       if (iov_iter_single_seg_count(from) != sizeof(*encoded))
> > > +               return -EINVAL;
> > > +       if (copy_from_iter(encoded, sizeof(*encoded), from) != sizeof(*encoded))
> > > +               return -EFAULT;
> > > +       if (encoded->compression == ENCODED_IOV_COMPRESSION_NONE &&
> > > +           encoded->encryption == ENCODED_IOV_ENCRYPTION_NONE) {
> > > +               iocb->ki_flags &= ~IOCB_ENCODED;
> > > +               return 0;
> > > +       }
> > > +       if (encoded->compression > ENCODED_IOV_COMPRESSION_TYPES ||
> > > +           encoded->encryption > ENCODED_IOV_ENCRYPTION_TYPES)
> > > +               return -EINVAL;
> > > +       if (!capable(CAP_SYS_ADMIN))
> > > +               return -EPERM;
> > 
> > How does this capable() check interact with io_uring? Without having
> > looked at this in detail, I suspect that when an encoded write is
> > requested through io_uring, the capable() check might be executed on
> > something like a workqueue worker thread, which is probably running
> > with a full capability set.
> 
> I discussed this more with Jens. You're right, per-IO permission checks
> aren't going to work. In fully-polled mode, we never get an opportunity
> to check capabilities in the right context. So, this will probably require a
> new open flag.

Actually, file_ns_capable() accomplishes the same thing without a new
open flag. Changing the capable() check to a file_ns_capable() check
against init_user_ns should be enough.
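
To illustrate why the two checks differ, here is a toy user-space model.
It is not kernel code: every toy_* name is hypothetical and the
structures are stand-ins for illustration. In the kernel, capable()
tests the credentials of the task that is currently running, while
file_ns_capable() tests the credentials that were saved in the struct
file when it was opened. The patch would presumably end up calling
something like file_ns_capable(iocb->ki_filp, &init_user_ns,
CAP_SYS_ADMIN); using iocb->ki_filp there is an assumption, not
something stated in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these are NOT the kernel's structures. */
struct toy_cred {
    bool cap_sys_admin;
};

struct toy_file {
    const struct toy_cred *opener_cred;  /* credentials saved at open time */
};

/* Models capable(): consults whoever is executing right now. */
static bool toy_capable(const struct toy_cred *current_task)
{
    return current_task->cap_sys_admin;
}

/* Models file_ns_capable(): consults the credentials saved in the file. */
static bool toy_file_capable(const struct toy_file *file)
{
    return file->opener_cred->cap_sys_admin;
}

/* Scenario from the thread: an unprivileged process opened the file, but
 * the write is executed by a worker that holds a full capability set.
 * Returns true if the encoded write would be allowed. */
static bool encoded_write_allowed(bool use_file_check)
{
    const struct toy_cred opener = { .cap_sys_admin = false };
    const struct toy_cred worker = { .cap_sys_admin = true };
    const struct toy_file file = { .opener_cred = &opener };

    return use_file_check ? toy_file_capable(&file)
                          : toy_capable(&worker);
}
```

In this model the per-I/O check lets the write through because it sees
the worker's credentials, whereas the file-based check refuses it
because it sees the opener's.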
