From: Al Viro <viro@zeniv.linux.org.uk>
To: Alice Ryhl <aliceryhl@google.com>
Cc: "Paul Moore" <paul@paul-moore.com>,
"James Morris" <jmorris@namei.org>,
"Serge E. Hallyn" <serge@hallyn.com>,
"Miguel Ojeda" <ojeda@kernel.org>,
"Christian Brauner" <brauner@kernel.org>,
"Alex Gaynor" <alex.gaynor@gmail.com>,
"Wedson Almeida Filho" <wedsonaf@gmail.com>,
"Boqun Feng" <boqun.feng@gmail.com>,
"Gary Guo" <gary@garyguo.net>,
"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
"Benno Lossin" <benno.lossin@proton.me>,
"Andreas Hindborg" <a.hindborg@samsung.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Arve Hjønnevåg" <arve@android.com>,
"Todd Kjos" <tkjos@android.com>,
"Martijn Coenen" <maco@android.com>,
"Joel Fernandes" <joel@joelfernandes.org>,
"Carlos Llamas" <cmllamas@google.com>,
"Suren Baghdasaryan" <surenb@google.com>,
"Dan Williams" <dan.j.williams@intel.com>,
"Matthew Wilcox" <willy@infradead.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Daniel Xu" <dxu@dxuuu.xyz>,
"Martin Rodriguez Reboredo" <yakoyoku@gmail.com>,
"Trevor Gross" <tmgross@umich.edu>,
linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org,
rust-for-linux@vger.kernel.org, linux-fsdevel@vger.kernel.org,
"Kees Cook" <kees@kernel.org>
Subject: Re: [PATCH v10 6/8] rust: file: add `FileDescriptorReservation`
Date: Sun, 15 Sep 2024 20:34:43 +0100 [thread overview]
Message-ID: <20240915193443.GK2825852@ZenIV> (raw)
In-Reply-To: <20240915183905.GI2825852@ZenIV>
On Sun, Sep 15, 2024 at 07:39:05PM +0100, Al Viro wrote:
> 2) calling thread MUST NOT unshare descriptor table while it has
> any reserved descriptors. I.e.
> fd = get_unused_fd();
> unshare_files();
> fd_install(fd, file);
> is a bug. Reservations are discarded by that. Getting rid of that
> constraint would require tracking the sets of reserved descriptors
> separately for each thread that happens to share the descriptor table.
> Conceptually they *are* per-thread - the same thread that has done
> reservation must either discard it or use it. However, it's easier to
> keep the "it's reserved by some thread" represented in descriptor table
> itself (bit set in ->open_fds bitmap, file reference in ->fd[] array is
> NULL) than try and keep track of who's reserved what. The constraint is
> basically "all reservations can stay with the old copy", i.e. "caller has
> no reservations of its own to transfer into the new private copy it gets".
FWIW, I toyed with the idea of having reservations kept per-thread;
it is possible and it simplifies some things, but I hadn't been able to
find a way to do that without buggering syscall latency for open() et.al.
It would keep track of the set of reservations in task_struct (count,
two-element array for the first two + page pointer for spillovers,
for the rare threads that need more than two reserved simultaneously).
Representation in fdtable:
state open_fds bit value in ->fd[] array
free clear 0UL
reserved set 0UL
uncommitted set 1UL|(unsigned long)file
open set (unsigned long)file
with file lookup treating any odd value as 0 (i.e. as reserved)
fd_install() switching reserved to uncommitted *AND* separate
"commit" operation that does this:
if current->reservation_count == 0
return
if failure
for each descriptor in our reserved set
v = fdtable->fd[descriptor]
if (v) {
fdtable->fd[descriptor] = 0;
fput((struct file *)(v & ~1);
}
clear bit in fdtable->open_fds[]
else
for each descriptor in our reserved set
v = fdtable->fd[descriptor]
if (v)
fdtable->fd[descriptor] = v & ~1;
else
BUG
current->reservation_count = 0
That "commit" thing would be called on return from syscall
for userland threads and would be called explicitly in
kernel threads that have a descriptor table and work with
it.
The benefit would be that fd_install() would *NOT* have to be done
after the last possible failure exit - something that installs
a lot of files wouldn't have to play convoluted games on cleanup.
Simply returning an error would do the right thing.
Two stores into ->fd[] instead of one is not a big deal;
however, anything like task_work_add() to arrange the call
of "commit" ends up being bloody awful.
We could have it called from syscall glue directly, but
that means touching assembler for quite a few architectures,
and I hadn't gotten around to that. Can be done, but it's
not a pleasant work...
next prev parent reply other threads:[~2024-09-15 19:35 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-15 14:31 [PATCH v10 0/8] File abstractions needed by Rust Binder Alice Ryhl
2024-09-15 14:31 ` [PATCH v10 1/8] rust: types: add `NotThreadSafe` Alice Ryhl
2024-09-15 15:38 ` Gary Guo
2024-09-27 11:21 ` Miguel Ojeda
2024-09-24 19:45 ` Serge E. Hallyn
2024-09-25 11:06 ` Alice Ryhl
2024-09-25 13:59 ` Serge E. Hallyn
2024-09-27 10:20 ` Gary Guo
2024-09-15 14:31 ` [PATCH v10 2/8] rust: task: add `Task::current_raw` Alice Ryhl
2024-09-15 14:31 ` [PATCH v10 3/8] rust: file: add Rust abstraction for `struct file` Alice Ryhl
2024-09-15 21:51 ` Gary Guo
2024-09-15 14:31 ` [PATCH v10 4/8] rust: cred: add Rust abstraction for `struct cred` Alice Ryhl
2024-09-15 20:24 ` Kees Cook
2024-09-15 20:55 ` Alice Ryhl
2024-09-19 7:57 ` Paul Moore
2024-09-15 14:31 ` [PATCH v10 5/8] rust: security: add abstraction for secctx Alice Ryhl
2024-09-15 20:58 ` Kees Cook
2024-09-15 21:07 ` Alice Ryhl
2024-09-16 15:40 ` Casey Schaufler
2024-09-17 13:18 ` Paul Moore
2024-09-22 15:01 ` Alice Ryhl
2024-09-22 15:08 ` Alice Ryhl
2024-09-22 16:50 ` Casey Schaufler
2024-09-22 17:04 ` Alice Ryhl
2024-09-19 7:56 ` Paul Moore
2024-09-15 14:31 ` [PATCH v10 6/8] rust: file: add `FileDescriptorReservation` Alice Ryhl
2024-09-15 18:39 ` Al Viro
2024-09-15 19:34 ` Al Viro [this message]
2024-09-16 4:18 ` Al Viro
2024-09-15 20:13 ` Alice Ryhl
2024-09-15 22:01 ` Al Viro
2024-09-15 22:05 ` Al Viro
2024-09-15 14:31 ` [PATCH v10 7/8] rust: file: add `Kuid` wrapper Alice Ryhl
2024-09-15 22:02 ` Gary Guo
2024-09-23 9:13 ` Alice Ryhl
2024-09-26 16:33 ` Christian Brauner
2024-09-26 16:35 ` [PATCH] [RFC] rust: add PidNamespace wrapper Christian Brauner
2024-09-27 12:04 ` Alice Ryhl
2024-09-27 14:21 ` Christian Brauner
2024-09-27 14:58 ` Alice Ryhl
2024-10-01 9:43 ` [PATCH v2] rust: add PidNamespace Christian Brauner
2024-10-01 10:26 ` Alice Ryhl
2024-10-01 14:17 ` Christian Brauner
2024-10-01 15:45 ` Miguel Ojeda
2024-10-02 10:14 ` Christian Brauner
2024-10-02 11:08 ` Miguel Ojeda
2024-10-01 19:10 ` Gary Guo
2024-10-02 11:05 ` Christian Brauner
2024-09-15 14:31 ` [PATCH v10 8/8] rust: file: add abstraction for `poll_table` Alice Ryhl
2024-09-15 22:24 ` Gary Guo
2024-09-23 9:10 ` Alice Ryhl
2024-09-27 9:28 ` [PATCH v10 0/8] File abstractions needed by Rust Binder Christian Brauner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240915193443.GK2825852@ZenIV \
--to=viro@zeniv.linux.org.uk \
--cc=a.hindborg@samsung.com \
--cc=alex.gaynor@gmail.com \
--cc=aliceryhl@google.com \
--cc=arve@android.com \
--cc=benno.lossin@proton.me \
--cc=bjorn3_gh@protonmail.com \
--cc=boqun.feng@gmail.com \
--cc=brauner@kernel.org \
--cc=cmllamas@google.com \
--cc=dan.j.williams@intel.com \
--cc=dxu@dxuuu.xyz \
--cc=gary@garyguo.net \
--cc=gregkh@linuxfoundation.org \
--cc=jmorris@namei.org \
--cc=joel@joelfernandes.org \
--cc=kees@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-security-module@vger.kernel.org \
--cc=maco@android.com \
--cc=ojeda@kernel.org \
--cc=paul@paul-moore.com \
--cc=peterz@infradead.org \
--cc=rust-for-linux@vger.kernel.org \
--cc=serge@hallyn.com \
--cc=surenb@google.com \
--cc=tglx@linutronix.de \
--cc=tkjos@android.com \
--cc=tmgross@umich.edu \
--cc=wedsonaf@gmail.com \
--cc=willy@infradead.org \
--cc=yakoyoku@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).