linux-security-module.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: linux-fsdevel@vger.kernel.org
Cc: "Paul Moore" <paul@paul-moore.com>,
	"James Morris" <jmorris@namei.org>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Christian Brauner" <brauner@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Wedson Almeida Filho" <wedsonaf@gmail.com>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Benno Lossin" <benno.lossin@proton.me>,
	"Andreas Hindborg" <a.hindborg@samsung.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Arve Hjønnevåg" <arve@android.com>,
	"Todd Kjos" <tkjos@android.com>,
	"Martijn Coenen" <maco@android.com>,
	"Joel Fernandes" <joel@joelfernandes.org>,
	"Carlos Llamas" <cmllamas@google.com>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Daniel Xu" <dxu@dxuuu.xyz>,
	"Martin Rodriguez Reboredo" <yakoyoku@gmail.com>,
	"Trevor Gross" <tmgross@umich.edu>,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	rust-for-linux@vger.kernel.org, "Kees Cook" <kees@kernel.org>,
	"Alice Ryhl" <aliceryhl@google.com>
Subject: Re: [PATCH v10 6/8] rust: file: add `FileDescriptorReservation`
Date: Mon, 16 Sep 2024 05:18:21 +0100	[thread overview]
Message-ID: <20240916041821.GN2825852@ZenIV> (raw)
In-Reply-To: <20240915193443.GK2825852@ZenIV>

On Sun, Sep 15, 2024 at 08:34:43PM +0100, Al Viro wrote:

> FWIW, I toyed with the idea of having reservations kept per-thread;
> it is possible and it simplifies some things, but I hadn't been able to
> find a way to do that without buggering syscall latency for open() et.al.

Hmm...  How about the following:

* add an equivalent of array of pairs (fd, file) to task_struct;
representation could be e.g. (a _VERY_ preliminary variant)
	unsigned fd_count;
	int fds[2];
	struct file *fp[2];
	void *spillover;
with 'spillover' being a separately allocated array of pairs to deal with
the moments when we have more than 2 simultaneously reserved descriptors.
Initially NULL, allocated the first time we need more than 2.  Always empty
outside of syscall.

* inline primitives:
	count_reserved_fds()
	reserved_descriptor(index)
	reserved_file(index)

* int reserve_fd(flags)
	returns -E... or index.

	slot = current->fd_count
	if (unlikely(slot == 2) && !current->spillover) {
		allocate spillover
		if failed
			return -ENOMEM
		set current->spillover
	}
	if slot is maximal allowed (2 + how much fits into allocated part?)
		return -E<something>
	fd = get_unused_fd_flags(flags);
	if (unlikely(fd < 0))
		return fd;
	if (likely(slot < 2)) {
		current->fds[slot] = fd;
		current->fp[slot] = NULL;
	} else {
		store (fd, NULL) into element #(slot - 2) of current->spillover
	}
	current->fd_count = slot + 1;

* void install_file(index, file)
	
	if (likely(slot < 2))
		current->fp[slot] = file;
	else
		store file to element #(slot - 2) of current->spillover

* void __commit_reservations(unsigned count, bool failed)
	// count == current->fd_count

	while (count--) {
		fd = reserved_descriptor(count);
		file = reserved_file(count);
		if (!file)
			put_unused_fd(fd);
		else if (!failed)
			fd_install(fd, file);
		else {
			put_unused_fd(fd);
			fput(file);
		}
	}
	current->fd_count = 0;

* static inline void commit_fd_reservations(bool failed)
	called in syscall glue, right after the syscall returns
	
	unsigned slots = current->fd_count;
	if (unlikely(slots))
		__commit_reservations(slots, failed);


Then we can (in addition to the current use of get_unused_fd_flags() et.al. -
that still works) do e.g. things like

	for (i = 0; i < 69; i++) {
		index = reserve_fd(FD_CLOEXEC);

		if (unlikely(index < 0))
			return index;

		file = some_driver_shite(some_shite, i);
		if (IS_ERR(file))
			return PTR_ERR(file);

		install_file(index, file); // consumed file

		ioctl_result.some_array[i] = reserved_descriptor(index);
		....
	}
	...
	if (copy_to_user(arg, &ioctl_result, sizeof(ioctl_result))
		return -EFAULT;
	...
	return 0;

and have it DTRT on all failures, no matter how many files we have added,
etc. - on syscall return we will either commit all reservations
(on success) or release all reserved descriptors and drop all files we
had planned to put into descriptor table.  Getting that right manually
is doable (drm has some examples), but it's _not_ pleasant.

The win here is in simpler cleanup code.  And it can coexist with the
current API just fine.  The PITA is in the need to add the call
of commit_fd_reservations() in syscall exit glue and have that done
on all architectures ;-/

FWIW, I suspect that it won't be slower than the current API, even
if used on hot paths.  pipe(2) would be an interesting testcase
for that - converting it is easy, and there's a plenty of loads
where latency of pipe(2) would be visible.

Comments?

  reply	other threads:[~2024-09-16  4:18 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-15 14:31 [PATCH v10 0/8] File abstractions needed by Rust Binder Alice Ryhl
2024-09-15 14:31 ` [PATCH v10 1/8] rust: types: add `NotThreadSafe` Alice Ryhl
2024-09-15 15:38   ` Gary Guo
2024-09-27 11:21     ` Miguel Ojeda
2024-09-24 19:45   ` Serge E. Hallyn
2024-09-25 11:06     ` Alice Ryhl
2024-09-25 13:59       ` Serge E. Hallyn
2024-09-27 10:20         ` Gary Guo
2024-09-15 14:31 ` [PATCH v10 2/8] rust: task: add `Task::current_raw` Alice Ryhl
2024-09-15 14:31 ` [PATCH v10 3/8] rust: file: add Rust abstraction for `struct file` Alice Ryhl
2024-09-15 21:51   ` Gary Guo
2024-09-15 14:31 ` [PATCH v10 4/8] rust: cred: add Rust abstraction for `struct cred` Alice Ryhl
2024-09-15 20:24   ` Kees Cook
2024-09-15 20:55     ` Alice Ryhl
2024-09-19  7:57   ` Paul Moore
2024-09-15 14:31 ` [PATCH v10 5/8] rust: security: add abstraction for secctx Alice Ryhl
2024-09-15 20:58   ` Kees Cook
2024-09-15 21:07     ` Alice Ryhl
2024-09-16 15:40       ` Casey Schaufler
2024-09-17 13:18         ` Paul Moore
2024-09-22 15:01           ` Alice Ryhl
2024-09-22 15:08         ` Alice Ryhl
2024-09-22 16:50           ` Casey Schaufler
2024-09-22 17:04             ` Alice Ryhl
2024-09-19  7:56   ` Paul Moore
2024-09-15 14:31 ` [PATCH v10 6/8] rust: file: add `FileDescriptorReservation` Alice Ryhl
2024-09-15 18:39   ` Al Viro
2024-09-15 19:34     ` Al Viro
2024-09-16  4:18       ` Al Viro [this message]
2024-09-15 20:13     ` Alice Ryhl
2024-09-15 22:01       ` Al Viro
2024-09-15 22:05         ` Al Viro
2024-09-15 14:31 ` [PATCH v10 7/8] rust: file: add `Kuid` wrapper Alice Ryhl
2024-09-15 22:02   ` Gary Guo
2024-09-23  9:13     ` Alice Ryhl
2024-09-26 16:33       ` Christian Brauner
2024-09-26 16:35       ` [PATCH] [RFC] rust: add PidNamespace wrapper Christian Brauner
2024-09-27 12:04         ` Alice Ryhl
2024-09-27 14:21           ` Christian Brauner
2024-09-27 14:58             ` Alice Ryhl
2024-10-01  9:43         ` [PATCH v2] rust: add PidNamespace Christian Brauner
2024-10-01 10:26           ` Alice Ryhl
2024-10-01 14:17             ` Christian Brauner
2024-10-01 15:45               ` Miguel Ojeda
2024-10-02 10:14                 ` Christian Brauner
2024-10-02 11:08                   ` Miguel Ojeda
2024-10-01 19:10           ` Gary Guo
2024-10-02 11:05             ` Christian Brauner
2024-09-15 14:31 ` [PATCH v10 8/8] rust: file: add abstraction for `poll_table` Alice Ryhl
2024-09-15 22:24   ` Gary Guo
2024-09-23  9:10     ` Alice Ryhl
2024-09-27  9:28 ` [PATCH v10 0/8] File abstractions needed by Rust Binder Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240916041821.GN2825852@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=a.hindborg@samsung.com \
    --cc=alex.gaynor@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=arve@android.com \
    --cc=benno.lossin@proton.me \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=brauner@kernel.org \
    --cc=cmllamas@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=dxu@dxuuu.xyz \
    --cc=gary@garyguo.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=jmorris@namei.org \
    --cc=joel@joelfernandes.org \
    --cc=kees@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=maco@android.com \
    --cc=ojeda@kernel.org \
    --cc=paul@paul-moore.com \
    --cc=peterz@infradead.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=serge@hallyn.com \
    --cc=surenb@google.com \
    --cc=tglx@linutronix.de \
    --cc=tkjos@android.com \
    --cc=tmgross@umich.edu \
    --cc=wedsonaf@gmail.com \
    --cc=willy@infradead.org \
    --cc=yakoyoku@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).