All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Alice Ryhl <aliceryhl@google.com>
Cc: "Paul Moore" <paul@paul-moore.com>,
	"James Morris" <jmorris@namei.org>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Christian Brauner" <brauner@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Wedson Almeida Filho" <wedsonaf@gmail.com>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Benno Lossin" <benno.lossin@proton.me>,
	"Andreas Hindborg" <a.hindborg@samsung.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Arve Hjønnevåg" <arve@android.com>,
	"Todd Kjos" <tkjos@android.com>,
	"Martijn Coenen" <maco@android.com>,
	"Joel Fernandes" <joel@joelfernandes.org>,
	"Carlos Llamas" <cmllamas@google.com>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Daniel Xu" <dxu@dxuuu.xyz>,
	"Martin Rodriguez Reboredo" <yakoyoku@gmail.com>,
	"Trevor Gross" <tmgross@umich.edu>,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	rust-for-linux@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	"Kees Cook" <kees@kernel.org>
Subject: Re: [PATCH v10 6/8] rust: file: add `FileDescriptorReservation`
Date: Sun, 15 Sep 2024 20:34:43 +0100	[thread overview]
Message-ID: <20240915193443.GK2825852@ZenIV> (raw)
In-Reply-To: <20240915183905.GI2825852@ZenIV>

On Sun, Sep 15, 2024 at 07:39:05PM +0100, Al Viro wrote:
> 	2) calling thread MUST NOT unshare descriptor table while it has
> any reserved descriptors.  I.e.
> 	fd = get_unused_fd();
> 	unshare_files();
> 	fd_install(fd, file);
> is a bug.  Reservations are discarded by that.  Getting rid of that
> constraint would require tracking the sets of reserved descriptors
> separately for each thread that happens to share the descriptor table.
> Conceptually they *are* per-thread - the same thread that has done
> reservation must either discard it or use it.  However, it's easier to
> keep the "it's reserved by some thread" represented in descriptor table
> itself (bit set in ->open_fds bitmap, file reference in ->fd[] array is
> NULL) than try and keep track of who's reserved what.  The constraint is
> basically "all reservations can stay with the old copy", i.e. "caller has
> no reservations of its own to transfer into the new private copy it gets".

FWIW, I toyed with the idea of having reservations kept per-thread;
it is possible and it simplifies some things, but I hadn't been able to
find a way to do that without buggering syscall latency for open() et.al.

It would keep track of the set of reservations in task_struct (count,
two-element array for the first two + page pointer for spillovers,
for the rare threads that need more than two reserved simultaneously).

Representation in fdtable:
state		open_fds bit	value in ->fd[] array
free		clear		0UL
reserved	set		0UL
uncommitted	set		1UL|(unsigned long)file
open		set		(unsigned long)file

with file lookup treating any odd value as 0 (i.e. as reserved)
fd_install() switching reserved to uncommitted *AND* separate
"commit" operation that does this:
	if current->reservation_count == 0
		return
	if failure
		for each descriptor in our reserved set
			v = fdtable->fd[descriptor]
			if (v) {
				fdtable->fd[descriptor] = 0;
				fput((struct file *)(v & ~1);
			}
			clear bit in fdtable->open_fds[]
	else
		for each descriptor in our reserved set
			v = fdtable->fd[descriptor]
			if (v)
				fdtable->fd[descriptor] = v & ~1;
			else
				BUG
	current->reservation_count = 0

That "commit" thing would be called on return from syscall
for userland threads and would be called explicitly in
kernel threads that have a descriptor table and work with
it.

The benefit would be that fd_install() would *NOT* have to be done
after the last possible failure exit - something that installs
a lot of files wouldn't have to play convoluted games on cleanup.
Simply returning an error would do the right thing.

Two stores into ->fd[] instead of one is not a big deal;
however, anything like task_work_add() to arrange the call
of "commit" ends up being bloody awful.

We could have it called from syscall glue directly, but
that means touching assembler for quite a few architectures,
and I hadn't gotten around to that.  Can be done, but it's
not a pleasant work...

  reply	other threads:[~2024-09-15 19:35 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-15 14:31 [PATCH v10 0/8] File abstractions needed by Rust Binder Alice Ryhl
2024-09-15 14:31 ` [PATCH v10 1/8] rust: types: add `NotThreadSafe` Alice Ryhl
2024-09-15 15:38   ` Gary Guo
2024-09-27 11:21     ` Miguel Ojeda
2024-09-24 19:45   ` Serge E. Hallyn
2024-09-25 11:06     ` Alice Ryhl
2024-09-25 13:59       ` Serge E. Hallyn
2024-09-27 10:20         ` Gary Guo
2024-09-15 14:31 ` [PATCH v10 2/8] rust: task: add `Task::current_raw` Alice Ryhl
2024-09-15 14:31 ` [PATCH v10 3/8] rust: file: add Rust abstraction for `struct file` Alice Ryhl
2024-09-15 21:51   ` Gary Guo
2024-09-15 14:31 ` [PATCH v10 4/8] rust: cred: add Rust abstraction for `struct cred` Alice Ryhl
2024-09-15 20:24   ` Kees Cook
2024-09-15 20:55     ` Alice Ryhl
2024-09-19  7:57   ` Paul Moore
2024-09-15 14:31 ` [PATCH v10 5/8] rust: security: add abstraction for secctx Alice Ryhl
2024-09-15 20:58   ` Kees Cook
2024-09-15 21:07     ` Alice Ryhl
2024-09-16 15:40       ` Casey Schaufler
2024-09-17 13:18         ` Paul Moore
2024-09-22 15:01           ` Alice Ryhl
2024-09-22 15:08         ` Alice Ryhl
2024-09-22 16:50           ` Casey Schaufler
2024-09-22 17:04             ` Alice Ryhl
2024-09-19  7:56   ` Paul Moore
2024-09-15 14:31 ` [PATCH v10 6/8] rust: file: add `FileDescriptorReservation` Alice Ryhl
2024-09-15 18:39   ` Al Viro
2024-09-15 19:34     ` Al Viro [this message]
2024-09-16  4:18       ` Al Viro
2024-09-15 20:13     ` Alice Ryhl
2024-09-15 22:01       ` Al Viro
2024-09-15 22:05         ` Al Viro
2024-09-15 14:31 ` [PATCH v10 7/8] rust: file: add `Kuid` wrapper Alice Ryhl
2024-09-15 22:02   ` Gary Guo
2024-09-23  9:13     ` Alice Ryhl
2024-09-26 16:33       ` Christian Brauner
2024-09-26 16:35         ` [PATCH] [RFC] rust: add PidNamespace wrapper Christian Brauner
2024-09-27 12:04           ` Alice Ryhl
2024-09-27 14:21             ` Christian Brauner
2024-09-27 14:58               ` Alice Ryhl
2024-10-01  9:43           ` [PATCH v2] rust: add PidNamespace Christian Brauner
2024-10-01 10:26             ` Alice Ryhl
2024-10-01 14:17               ` Christian Brauner
2024-10-01 15:45                 ` Miguel Ojeda
2024-10-02 10:14                   ` Christian Brauner
2024-10-02 11:08                     ` Miguel Ojeda
2024-10-01 19:10             ` Gary Guo
2024-10-02 11:05               ` Christian Brauner
2024-09-15 14:31 ` [PATCH v10 8/8] rust: file: add abstraction for `poll_table` Alice Ryhl
2024-09-15 22:24   ` Gary Guo
2024-09-23  9:10     ` Alice Ryhl
2024-09-27  9:28 ` [PATCH v10 0/8] File abstractions needed by Rust Binder Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240915193443.GK2825852@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=a.hindborg@samsung.com \
    --cc=alex.gaynor@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=arve@android.com \
    --cc=benno.lossin@proton.me \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=brauner@kernel.org \
    --cc=cmllamas@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=dxu@dxuuu.xyz \
    --cc=gary@garyguo.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=jmorris@namei.org \
    --cc=joel@joelfernandes.org \
    --cc=kees@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=maco@android.com \
    --cc=ojeda@kernel.org \
    --cc=paul@paul-moore.com \
    --cc=peterz@infradead.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=serge@hallyn.com \
    --cc=surenb@google.com \
    --cc=tglx@linutronix.de \
    --cc=tkjos@android.com \
    --cc=tmgross@umich.edu \
    --cc=wedsonaf@gmail.com \
    --cc=willy@infradead.org \
    --cc=yakoyoku@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.