linux-fsdevel.vger.kernel.org archive mirror
From: Christian Brauner <brauner@kernel.org>
To: Florian Weimer <fweimer@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Tycho Andersen <tycho@tycho.pizza>,
	linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
	Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, Jens Axboe <axboe@kernel.dk>
Subject: Re: [RFC 1/3] pidfd: allow pidfd_open() on non-thread-group leaders
Date: Thu, 7 Dec 2023 23:58:53 +0100
Message-ID: <20231207-entdecken-selektiert-d5ce6dca6a80@brauner>
In-Reply-To: <87ttp3rprd.fsf@oldenburg.str.redhat.com>

[adjusting Cc as that's really a separate topic]

On Thu, Nov 30, 2023 at 08:43:18PM +0100, Florian Weimer wrote:
> * Mathieu Desnoyers:
> 
> >>> I'd like to offer a userspace API which allows safe stashing of
> >>> unreachable file descriptors on a service thread.

Fwiw, systemd has a concept called the fdstore:

https://systemd.io/FILE_DESCRIPTOR_STORE

"The file descriptor store [...] allows services to upload during
runtime additional fds to the service manager that it shall keep on its
behalf. File descriptors are passed back to the service on subsequent
activations, the same way as any socket activation fds are passed.

[...]

The primary use-case of this logic is to permit services to restart
seamlessly (for example to update them to a newer version), without
losing execution context, dropping pinned resources, terminating
established connections or even just momentarily losing connectivity. In
fact, as the file descriptors can be uploaded freely at any time during
the service runtime, this can even be used to implement services that
robustly handle abnormal termination and can recover from that without
losing pinned resources."
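
For illustration only, a minimal sketch of what such an upload looks like
on the wire, assuming the documented notify protocol: a datagram with
"FDSTORE=1" sent to $NOTIFY_SOCKET, carrying the fd as SCM_RIGHTS. The
helper name fdstore_push() is made up here; real services would normally
call sd_pid_notify_with_fds() from libsystemd instead. The unit also
needs FileDescriptorStoreMax= set to a non-zero value, and this sketch
only handles path and abstract AF_UNIX notify sockets.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: push one fd into the service manager's fdstore.
 * Returns 0 if NOTIFY_SOCKET is unset (nothing to do), 1 if the message
 * was sent, or a negative errno on failure.
 */
int fdstore_push(int fd, const char *name)
{
	const char *path = getenv("NOTIFY_SOCKET");
	struct sockaddr_un sa = { .sun_family = AF_UNIX };
	union {
		struct cmsghdr align;
		char buf[CMSG_SPACE(sizeof(int))];
	} ctl;
	char state[128];
	struct iovec iov;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	size_t len;
	int sock, n, ret;

	if (!path)
		return 0;

	len = strlen(path);
	/* Assumption: only '/' (path) and '@' (abstract) sockets are handled. */
	if ((path[0] != '/' && path[0] != '@') || len < 2 ||
	    len >= sizeof(sa.sun_path))
		return -EINVAL;
	memcpy(sa.sun_path, path, len);
	if (path[0] == '@')
		sa.sun_path[0] = '\0';	/* abstract namespace */

	n = snprintf(state, sizeof(state), "FDSTORE=1\nFDNAME=%s", name);
	if (n < 0 || (size_t)n >= sizeof(state))
		return -EINVAL;

	sock = socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
	if (sock < 0)
		return -errno;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	iov.iov_base = state;
	iov.iov_len = (size_t)n;
	msg.msg_name = &sa;
	msg.msg_namelen = offsetof(struct sockaddr_un, sun_path) + len;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	/* Attach the fd as ancillary data. */
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	ret = sendmsg(sock, &msg, MSG_NOSIGNAL) < 0 ? -errno : 1;
	close(sock);
	return ret;
}
```

Outside of a service manager NOTIFY_SOCKET is unset and the helper is a
no-op, which mirrors how the sd_notify() family behaves.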

> 
> >> By "safe" here do you mean not accessible via pidfd_getfd()?
> 
> No, unreachable by close/close_range/dup2/dup3.  I expect we can do an
> intra-process transfer using /proc, but I'm hoping for something nicer.

File descriptors are reachable for all processes/threads that share a
file descriptor table. Changing that means breaking core userspace
assumptions about how file descriptors work. That's not going to happen
as far as I'm concerned.
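
To make that concrete, a minimal sketch (not from the thread; the
function name is made up for illustration). Any code that knows or
guesses the number can replace a "stashed" descriptor with dup2(). The
demo runs in a single thread, but threads share the same table, so the
same holds across threads.

```c
#include <unistd.h>

/*
 * Returns 1 if "unrelated" code managed to replace the stashed
 * descriptor, 0 if it did not, -1 on setup failure.
 */
int stashed_fd_can_be_clobbered(void)
{
	int stash[2], other[2];
	char c = 0;
	int ret = -1;

	if (pipe(stash) < 0)
		return -1;
	if (pipe(other) < 0) {
		close(stash[0]);
		close(stash[1]);
		return -1;
	}

	/* Mark the second pipe so the two can be told apart later. */
	if (write(other[1], "X", 1) != 1)
		goto out;

	/* The "unrelated" code needs nothing but the fd number. */
	if (dup2(other[0], stash[0]) < 0)
		goto out;

	/* stash[0] now silently refers to the other pipe. */
	ret = (read(stash[0], &c, 1) == 1 && c == 'X');
out:
	close(stash[0]);
	close(stash[1]);
	close(other[0]);
	close(other[1]);
	return ret;
}
```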

We may consider additional security_* hooks in close*() and dup*(). That
would allow you to utilize Landlock or BPF LSM to prevent file
descriptors from being closed or duplicated. pidfd_getfd() is already
blockable via security_file_receive().

In general, messing with fds in that way is really not a good idea.

If you need something that awkward, then you should go all the way and
look at io_uring which basically has a separate fd-like handle called
"fixed files".

Fixed file indexes are separate file-descriptor-like handles that can
only be used from io_uring calls but not with the regular system call
interface.

IOW, you can refer to a file using an io_uring fixed index. The index to
use can be chosen by userspace and can't be used with any regular
fd-based system calls.
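
A rough sketch of the registration side, using raw syscalls rather than
liburing. The helper names are made up for illustration, and it assumes
kernel headers that ship <linux/io_uring.h> plus a kernel and sandbox
that permit io_uring. It registers a pipe's read end as fixed file
index 0, closes the regular descriptor, and checks that the number is
gone while the ring keeps the file alive.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/*
 * Hypothetical helper: create a small ring and register `fd` as fixed
 * file index 0. Returns the ring fd, or a negative errno.
 */
int fixed_file_stash(int fd)
{
	struct io_uring_params params;
	int fds[1] = { fd };
	int ring, err;

	memset(&params, 0, sizeof(params));
	ring = (int)syscall(__NR_io_uring_setup, 4, &params);
	if (ring < 0)
		return -errno;

	if (syscall(__NR_io_uring_register, ring, IORING_REGISTER_FILES,
		    fds, 1) < 0) {
		err = errno;
		close(ring);
		return -err;
	}
	return ring;
}

/*
 * Returns 1 if the regular fd number is gone after registration,
 * 0 if it is still open, or a negative errno if io_uring is not usable.
 */
int fixed_file_demo(void)
{
	int p[2], ring, gone;

	if (pipe(p) < 0)
		return -errno;

	ring = fixed_file_stash(p[0]);
	close(p[0]);		/* drop the regular descriptor either way */
	close(p[1]);
	if (ring < 0)
		return ring;

	/* The number is no longer valid for regular fd-based syscalls. */
	gone = (fcntl(p[0], F_GETFD) == -1 && errno == EBADF);

	close(ring);		/* tearing down the ring releases the file */
	return gone;
}
```

To actually do I/O on the stashed file, a submission would set
IOSQE_FIXED_FILE in sqe->flags and put the index (0 here) in sqe->fd.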

The io_uring fd itself can be made a fixed file as well.

The only thing missing would be to turn an io_uring fixed file back into
a regular file descriptor. That could probably be done by using
receive_fd() and then installing that fd back into the caller's file
descriptor table. But that would require an io_uring patch.
