rust-for-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Alice Ryhl <aliceryhl@google.com>
Cc: "Paul Moore" <paul@paul-moore.com>,
	"James Morris" <jmorris@namei.org>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Christian Brauner" <brauner@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Wedson Almeida Filho" <wedsonaf@gmail.com>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Benno Lossin" <benno.lossin@proton.me>,
	"Andreas Hindborg" <a.hindborg@samsung.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Arve Hjønnevåg" <arve@android.com>,
	"Todd Kjos" <tkjos@android.com>,
	"Martijn Coenen" <maco@android.com>,
	"Joel Fernandes" <joel@joelfernandes.org>,
	"Carlos Llamas" <cmllamas@google.com>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Daniel Xu" <dxu@dxuuu.xyz>,
	"Martin Rodriguez Reboredo" <yakoyoku@gmail.com>,
	"Trevor Gross" <tmgross@umich.edu>,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	rust-for-linux@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	"Kees Cook" <kees@kernel.org>
Subject: Re: [PATCH v10 6/8] rust: file: add `FileDescriptorReservation`
Date: Sun, 15 Sep 2024 19:39:05 +0100	[thread overview]
Message-ID: <20240915183905.GI2825852@ZenIV> (raw)
In-Reply-To: <20240915-alice-file-v10-6-88484f7a3dcf@google.com>

On Sun, Sep 15, 2024 at 02:31:32PM +0000, Alice Ryhl wrote:

> +impl Drop for FileDescriptorReservation {
> +    fn drop(&mut self) {
> +        // SAFETY: By the type invariants of this type, `self.fd` was previously returned by
> +        // `get_unused_fd_flags`. We have not yet used the fd, so it is still valid, and `current`
> +        // still refers to the same task, as this type cannot be moved across task boundaries.
> +        unsafe { bindings::put_unused_fd(self.fd) };
> +    }
> +}

FWIW, it's a bit more delicate.  The real rules for API users are

	1) anything you get from get_unused_fd_flags() (well, alloc_fd(),
internally) must be passed either to put_unused_fd() or fd_install() before
you return from syscall.  That should be done by the same thread and
all calls of put_unused_fd() or fd_install() should be paired with
some get_unused_fd_flags() in that manner (i.e. by the same thread,
within the same syscall, etc.)

	2) calling thread MUST NOT unshare descriptor table while it has
any reserved descriptors.  I.e.
	fd = get_unused_fd();
	unshare_files();
	fd_install(fd, file);
is a bug.  Reservations are discarded by that.  Getting rid of that
constraint would require tracking the sets of reserved descriptors
separately for each thread that happens to share the descriptor table.
Conceptually they *are* per-thread - the same thread that has done
reservation must either discard it or use it.  However, it's easier to
keep the "it's reserved by some thread" represented in descriptor table
itself (bit set in ->open_fds bitmap, file reference in ->fd[] array is
NULL) than try and keep track of who's reserved what.  The constraint is
basically "all reservations can stay with the old copy", i.e. "caller has
no reservations of its own to transfer into the new private copy it gets".
	It's not particularly onerous[*] and it simplifies things
quite a bit.  However, if we are documenting thing, it needs to be
put explicitly.  With respect to Rust, if you do e.g. binfmt-in-rust
support it will immediately become an issue - begin_new_exec() is calling
unshare_files(), so the example above can become an issue.

	Internally (in fs/file.c, that is) we have additional safety
rule - anything that might be given an arbitrary descriptor (e.g.
do_dup2() destination can come directly from dup2(2) argument,
file_close_fd_locked() victim can come directly from close(2) one,
etc.) must leave reserved descriptors alone.  Not an issue API users
need to watch out for, though.

[*] unsharing the descriptor table is done by
	+ close_range(2), which has no reason to allocate any descriptors
and is only called by userland.
	+ unshare(2), which has no reason to allocate any descriptors
and is only called by userland.
	+ a place in early init that call ksys_unshare() while arranging
the environment for /linuxrc from initrd image to be run.  Again, no
reserved descriptors there.
	+ coredumping thread in the beginning of do_coredump().
The caller is at the point of signal delivery, which means that it had
already left whatever syscall it might have been in.  Which means
that all reservations must have been undone by that point.
	+ execve() at the point of no return (in begin_new_exec()).
That's the only place where violation of that constraint on some later
changes is plausible.  That one needs to be watched out for.

  reply	other threads:[~2024-09-15 18:39 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-15 14:31 [PATCH v10 0/8] File abstractions needed by Rust Binder Alice Ryhl
2024-09-15 14:31 ` [PATCH v10 1/8] rust: types: add `NotThreadSafe` Alice Ryhl
2024-09-15 15:38   ` Gary Guo
2024-09-27 11:21     ` Miguel Ojeda
2024-09-24 19:45   ` Serge E. Hallyn
2024-09-25 11:06     ` Alice Ryhl
2024-09-25 13:59       ` Serge E. Hallyn
2024-09-27 10:20         ` Gary Guo
2024-09-15 14:31 ` [PATCH v10 2/8] rust: task: add `Task::current_raw` Alice Ryhl
2024-09-15 14:31 ` [PATCH v10 3/8] rust: file: add Rust abstraction for `struct file` Alice Ryhl
2024-09-15 21:51   ` Gary Guo
2024-09-15 14:31 ` [PATCH v10 4/8] rust: cred: add Rust abstraction for `struct cred` Alice Ryhl
2024-09-15 20:24   ` Kees Cook
2024-09-15 20:55     ` Alice Ryhl
2024-09-19  7:57   ` Paul Moore
2024-09-15 14:31 ` [PATCH v10 5/8] rust: security: add abstraction for secctx Alice Ryhl
2024-09-15 20:58   ` Kees Cook
2024-09-15 21:07     ` Alice Ryhl
2024-09-16 15:40       ` Casey Schaufler
2024-09-17 13:18         ` Paul Moore
2024-09-22 15:01           ` Alice Ryhl
2024-09-22 15:08         ` Alice Ryhl
2024-09-22 16:50           ` Casey Schaufler
2024-09-22 17:04             ` Alice Ryhl
2024-09-19  7:56   ` Paul Moore
2024-09-15 14:31 ` [PATCH v10 6/8] rust: file: add `FileDescriptorReservation` Alice Ryhl
2024-09-15 18:39   ` Al Viro [this message]
2024-09-15 19:34     ` Al Viro
2024-09-16  4:18       ` Al Viro
2024-09-15 20:13     ` Alice Ryhl
2024-09-15 22:01       ` Al Viro
2024-09-15 22:05         ` Al Viro
2024-09-15 14:31 ` [PATCH v10 7/8] rust: file: add `Kuid` wrapper Alice Ryhl
2024-09-15 22:02   ` Gary Guo
2024-09-23  9:13     ` Alice Ryhl
2024-09-26 16:33       ` Christian Brauner
2024-09-26 16:35       ` [PATCH] [RFC] rust: add PidNamespace wrapper Christian Brauner
2024-09-27 12:04         ` Alice Ryhl
2024-09-27 14:21           ` Christian Brauner
2024-09-27 14:58             ` Alice Ryhl
2024-10-01  9:43         ` [PATCH v2] rust: add PidNamespace Christian Brauner
2024-10-01 10:26           ` Alice Ryhl
2024-10-01 14:17             ` Christian Brauner
2024-10-01 15:45               ` Miguel Ojeda
2024-10-02 10:14                 ` Christian Brauner
2024-10-02 11:08                   ` Miguel Ojeda
2024-10-01 19:10           ` Gary Guo
2024-10-02 11:05             ` Christian Brauner
2024-09-15 14:31 ` [PATCH v10 8/8] rust: file: add abstraction for `poll_table` Alice Ryhl
2024-09-15 22:24   ` Gary Guo
2024-09-23  9:10     ` Alice Ryhl
2024-09-27  9:28 ` [PATCH v10 0/8] File abstractions needed by Rust Binder Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240915183905.GI2825852@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=a.hindborg@samsung.com \
    --cc=alex.gaynor@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=arve@android.com \
    --cc=benno.lossin@proton.me \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=brauner@kernel.org \
    --cc=cmllamas@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=dxu@dxuuu.xyz \
    --cc=gary@garyguo.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=jmorris@namei.org \
    --cc=joel@joelfernandes.org \
    --cc=kees@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=maco@android.com \
    --cc=ojeda@kernel.org \
    --cc=paul@paul-moore.com \
    --cc=peterz@infradead.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=serge@hallyn.com \
    --cc=surenb@google.com \
    --cc=tglx@linutronix.de \
    --cc=tkjos@android.com \
    --cc=tmgross@umich.edu \
    --cc=wedsonaf@gmail.com \
    --cc=willy@infradead.org \
    --cc=yakoyoku@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).