From: Josh Triplett <josh@joshtriplett.org>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>,
io-uring@vger.kernel.org,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
lkml <linux-kernel@vger.kernel.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Arnd Bergmann <arnd@arndb.de>, Jens Axboe <axboe@kernel.dk>,
Aleksa Sarai <cyphar@cyphar.com>,
linux-man <linux-man@vger.kernel.org>,
Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH v5 2/3] fs: openat2: Extend open_how to allow userspace-selected fds
Date: Thu, 23 Apr 2020 00:33:10 -0700
Message-ID: <20200423073310.GA169998@localhost>
In-Reply-To: <CAJfpeguaVYo-Lf-5Bi=EYJYWdmCfo3BqZA=kj9E5UmDb0mBc1w@mail.gmail.com>

On Thu, Apr 23, 2020 at 08:04:25AM +0200, Miklos Szeredi wrote:
> On Thu, Apr 23, 2020 at 6:42 AM Josh Triplett <josh@joshtriplett.org> wrote:
> >
> > On Thu, Apr 23, 2020 at 06:24:14AM +0200, Miklos Szeredi wrote:
> > > On Thu, Apr 23, 2020 at 2:48 AM Josh Triplett <josh@joshtriplett.org> wrote:
> > > > On Wed, Apr 22, 2020 at 09:55:56AM +0200, Miklos Szeredi wrote:
> > > > > On Wed, Apr 22, 2020 at 8:06 AM Michael Kerrisk (man-pages)
> > > > > <mtk.manpages@gmail.com> wrote:
> > > > > >
> > > > > > [CC += linux-api]
> > > > > >
> > > > > > On Wed, 22 Apr 2020 at 07:20, Josh Triplett <josh@joshtriplett.org> wrote:
> > > > > > >
> > > > > > > Inspired by the X protocol's handling of XIDs, allow userspace to select
> > > > > > > the file descriptor opened by openat2, so that it can use the resulting
> > > > > > > file descriptor in subsequent system calls without waiting for the
> > > > > > > response to openat2.
> > > > > > >
> > > > > > > In io_uring, this allows sequences like openat2/read/close without
> > > > > > > waiting for the openat2 to complete. Multiple such sequences can
> > > > > > > overlap, as long as each uses a distinct file descriptor.
> > > > >
> > > > > If this is primarily an io_uring feature, then why burden the normal
> > > > > openat2 API with this?
> > > >
> > > > This feature was inspired by io_uring; it isn't exclusively of value
> > > > with io_uring. (And io_uring doesn't normally change the semantics of
> > > > syscalls.)
> > >
> > > What's the use case of O_SPECIFIC_FD beyond io_uring?
> >
> > Avoiding a call to dup2 and close, if you need something as a specific
> > file descriptor, such as when setting up to exec something, or when
> > debugging a program.
> >
> > I don't expect it to be as widely used as with io_uring, but I also
> > don't want io_uring versions of syscalls to diverge from the underlying
> > syscalls, and this would be a heavy divergence.
>
> What are the plans for those syscalls that don't easily lend
> themselves to this modification (such as accept(2))?

accept4 has a flags argument with more flags available, so it'd be
entirely possible to cleanly extend it further without introducing a new
version. The same goes for other fd-producing syscalls that still have
flag space available.

This may or may not provide sufficient motivation on its own to
introduce a new syscall variant of a syscall that isn't currently
extensible.

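
To illustrate the flag space referred to above: accept4(2) already takes
a flags argument, which today carries SOCK_CLOEXEC and SOCK_NONBLOCK. The
following is a minimal sketch using only those two existing flags; no
userspace-selected-fd flag exists for accept4, so none is shown. The
helper name accept_with_flags and the abstract socket name are made up
for this example.

```c
#define _GNU_SOURCE          /* accept4() is a Linux/glibc extension */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: set up a throwaway AF_UNIX listener, connect to it
 * from the same process, and accept that connection with the given
 * accept4() flags. Returns the accepted fd, or -1 on error.
 */
int accept_with_flags(int flags)
{
    struct sockaddr_un addr;
    socklen_t len;
    int listener = -1, client = -1, accepted = -1;
    int n;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    /* sun_path[0] stays NUL: Linux abstract namespace, no file on disk. */
    n = snprintf(addr.sun_path + 1, sizeof(addr.sun_path) - 1,
                 "accept4-sketch-%ld", (long)getpid());
    assert(n > 0 && (size_t)n < sizeof(addr.sun_path) - 1);
    len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + (size_t)n);

    listener = socket(AF_UNIX, SOCK_STREAM, 0);
    client = socket(AF_UNIX, SOCK_STREAM, 0);
    if (listener < 0 || client < 0)
        goto out;
    if (bind(listener, (struct sockaddr *)&addr, len) < 0 ||
        listen(listener, 1) < 0)
        goto out;
    /* On AF_UNIX stream sockets, connect() returns once the request is
     * queued in the listener's backlog, so no second thread is needed. */
    if (connect(client, (struct sockaddr *)&addr, len) < 0)
        goto out;

    /* The flags are applied to the new fd atomically with its creation. */
    accepted = accept4(listener, NULL, NULL, flags);
out:
    if (client >= 0)
        close(client);
    if (listener >= 0)
        close(listener);
    return accepted;
}
```

Calling accept_with_flags(SOCK_CLOEXEC | SOCK_NONBLOCK) should yield a
descriptor with FD_CLOEXEC and O_NONBLOCK already set, which can be
checked with fcntl(fd, F_GETFD) and fcntl(fd, F_GETFL).
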
> Compared to that, having a common flag for file ops to enable the use
> of fixed and private file descriptors is a clean and well contained
> interface.

"private" is not a desirable property here. See below. Even if the
userspace-specified fd mechanism were to become something only
accessible via io_uring (which I'd prefer to avoid), that's not a reason
to avoid generating real file descriptors that work anywhere a file
descriptor works.

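
One concrete case of "anywhere a file descriptor works" is handing an
ordinary descriptor to another process over a UNIX socket with
SCM_RIGHTS. A rough sketch follows, using only existing socket calls;
the helper names send_fd and recv_fd are hypothetical and not part of
the patch series.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one int-sized fd. */
union fd_ctrl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send descriptor fd over the UNIX socket sock. Returns 0 or -1. */
int send_fd(int sock, int fd)
{
    /* Linux stream sockets need at least one byte of ordinary data
     * to carry ancillary data along with it. */
    char byte = 'x';
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    assert(cmsg != NULL);            /* buffer is large enough by construction */
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock. Returns the new fd, or -1 on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    /* The kernel installs a new descriptor in the receiver's fd table. */
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a descriptor number of its own that refers to the same
open file, regardless of how the sender originally obtained it.
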
> > > > > > This would also allow implementing a private fd table for io_uring.
> > > > > I.e. add a flag interpreted by file ops (IORING_PRIVATE_FD), including
> > > > > openat2 and freely use the private fd space without having to worry
> > > > > about interactions with other parts of the system.
> > > >
> > > > I definitely don't want to add a special kind of file descriptor that
> > > > doesn't work in normal syscalls taking file descriptors. A file
> > > > descriptor allocated via O_SPECIFIC_FD is an entirely normal file
> > > > descriptor, and works anywhere a file descriptor normally works.
> > >
> > > What's the use case of allocating a file descriptor within io_uring
> > > and using it outside of io_uring?
> >
> > Calling a syscall not provided via io_uring. Calling a library that
> > doesn't use io_uring. Passing the file descriptor via UNIX socket to
> > another program. Passing the file descriptor via exec to another
> > program. Userspace is modular, and file descriptors are widely used.
>
> I mean, you could open the file descriptor outside of io_uring in such
> cases, no?

I would prefer to not introduce that limitation in the first place, and
instead open normal file descriptors.

> The point of O_SPECIFIC_FD is to be able to perform short
> sequences of open/dosomething/close without having to block and having
> to issue separate syscalls.

"close" is not a required component. It's entirely possible to use
io_uring to open a file descriptor, do various things with it, and then
leave it open for subsequent usage via either other io_uring chains or
standalone syscalls.

> If you're going to issue separate
> syscalls anyway, then I see no point in doing the open within
> io_uring. Or?

io_uring is not an all-or-nothing proposition. There's value in using
io_uring for some operations without converting an entire program (and
every library it might potentially use on a file descriptor) entirely to
io_uring. Userspace is modular, and file descriptors are a common
element used by many different bits of userspace.
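
For reference, the dup2-and-close sequence mentioned earlier in the
thread looks roughly like the sketch below. It uses only existing
syscalls and does not use the proposed O_SPECIFIC_FD flag; the helper
name open_at_fd is hypothetical, and the sketch assumes flags does not
include O_CREAT (no mode argument is passed).

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open path and make it available at descriptor number target_fd,
 * silently replacing whatever was open there. Returns target_fd on
 * success, or -1 on error.
 */
int open_at_fd(const char *path, int flags, int target_fd)
{
    int fd;

    assert(target_fd >= 0);          /* caller contract for this sketch */

    fd = open(path, flags);          /* syscall 1: lowest free fd */
    if (fd < 0)
        return -1;
    if (fd == target_fd)
        return fd;                   /* already in place, nothing to move */

    if (dup2(fd, target_fd) < 0) {   /* syscall 2: copy to the wanted number */
        close(fd);
        return -1;
    }
    close(fd);                       /* syscall 3: drop the temporary fd */
    return target_fd;
}
```

A call such as open_at_fd("/dev/null", O_WRONLY, 1) before exec is the
kind of three-step setup that a single open with a userspace-selected fd
would replace.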