From: Andreas Hindborg <a.hindborg@kernel.org>
To: "Alice Ryhl" <aliceryhl@google.com>
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Benno Lossin" <benno.lossin@proton.me>,
	"Trevor Gross" <tmgross@umich.edu>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"Matthew Maurer" <mmaurer@google.com>,
	"Lee Jones" <lee@kernel.org>,
	linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org
Subject: Re: [PATCH 1/5] rust: iov: add iov_iter abstractions for ITER_SOURCE
Date: Wed, 19 Mar 2025 19:54:18 +0100
Message-ID: <871pushmw5.fsf@kernel.org>
In-Reply-To: <20250311-iov-iter-v1-1-f6c9134ea824@google.com> (Alice Ryhl's message of "Tue, 11 Mar 2025 14:25:12 +0000")

"Alice Ryhl" <aliceryhl@google.com> writes:

> This adds abstractions for the iov_iter type in the case where
> data_source is ITER_SOURCE. This will make Rust implementations of
> fops->write_iter possible.
>
> This series only has support for using existing IO vectors created by C
> code. Additional abstractions will be needed to support the creation of
> IO vectors in Rust code.
>
> These abstractions make the assumption that `struct iov_iter` does not
> have internal self-references, which implies that it is valid to move it
> between different local variables, and that you can make a copy of it to
> get two IO vectors into the same buffers.
>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  rust/kernel/iov.rs | 170 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  rust/kernel/lib.rs |   1 +
>  2 files changed, 171 insertions(+)
>
> diff --git a/rust/kernel/iov.rs b/rust/kernel/iov.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..4498f65e1f65bd964909810c020db3a9f8fae389
> --- /dev/null
> +++ b/rust/kernel/iov.rs
> @@ -0,0 +1,170 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2025 Google LLC.
> +
> +//! IO vectors.
> +//!
> +//! C headers: [`include/linux/iov_iter.h`](srctree/include/linux/iov_iter.h),
> +//! [`include/linux/uio.h`](srctree/include/linux/uio.h)
> +
> +use crate::{bindings, prelude::*, types::Opaque};
> +use core::{marker::PhantomData, mem::MaybeUninit, slice};
> +
> +const ITER_SOURCE: bool = bindings::ITER_SOURCE != 0;
> +const ITER_DEST: bool = bindings::ITER_DEST != 0;
> +
> +// Compile-time assertion for the above constants.
> +const _: () = {
> +    if ITER_SOURCE == ITER_DEST {
> +        panic!("ITER_DEST and ITER_SOURCE should be different.");
> +    }
> +};

Don't we have some kind of `build_assert` macro to do this? If not,
maybe we should.
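
For reference, a minimal sketch of what I have in mind. The raw constant
values below are stand-ins I made up for this example, not taken from
`bindings`. If I remember correctly, `static_assert!` in the kernel
crate expands to roughly this form, but please double check:

```rust
// Stand-ins for `bindings::ITER_SOURCE` and `bindings::ITER_DEST`. The values
// are assumptions for this sketch only.
const RAW_ITER_DEST: u32 = 0;
const RAW_ITER_SOURCE: u32 = 1;

const ITER_SOURCE: bool = RAW_ITER_SOURCE != 0;
const ITER_DEST: bool = RAW_ITER_DEST != 0;

// Evaluated at compile time: the build fails if the condition is false.
const _: () = assert!(
    ITER_SOURCE != ITER_DEST,
    "ITER_DEST and ITER_SOURCE should be different."
);
```

This says the same thing as the open-coded `if` + `panic!` block, just
in one expression.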

> +
> +/// An IO vector that acts as a source of data.
> +///
> +/// # Invariants
> +///
> +/// Must hold a valid `struct iov_iter` with `data_source` set to `ITER_SOURCE`. The buffers
> +/// referenced by the IO vector must be valid for reading for the duration of `'data`.
> +///
> +/// Note that if the IO vector is backed by a userspace pointer, it is always considered valid for
> +/// reading.

How did you arrive at this conclusion? In the discussions with Ralph on
the coherent allocator thread, we arrived at conflicting answers.

> +#[repr(transparent)]
> +pub struct IovIterSource<'data> {
> +    iov: Opaque<bindings::iov_iter>,
> +    /// Represent to the type system that this value contains a pointer to readable data it does
> +    /// not own.
> +    _source: PhantomData<&'data [u8]>,
> +}
> +
> +// SAFETY: This struct is essentially just a fancy `std::io::Cursor<&[u8]>`, and that type is safe
> +// to send across thread boundaries.

I don't think this safety comment is sufficient. Could you give the
argument without deferring to this type in `std`?

> +unsafe impl<'data> Send for IovIterSource<'data> {}
> +// SAFETY: This struct is essentially just a fancy `std::io::Cursor<&[u8]>`, and that type is safe
> +// to share across thread boundaries.

Same as above.
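
To illustrate the kind of argument I am asking for, here is a userspace
sketch. `Source` is a hypothetical stand-in that I made up, holding a
raw pointer into borrowed read-only bytes. The real justification would
have to talk about the fields of `struct iov_iter` instead:

```rust
use core::marker::PhantomData;

/// Hypothetical stand-in for `IovIterSource`: a raw pointer plus a length,
/// borrowing read-only bytes for `'data`.
pub struct Source<'data> {
    ptr: *const u8,
    len: usize,
    _source: PhantomData<&'data [u8]>,
}

impl<'data> Source<'data> {
    pub fn new(buf: &'data [u8]) -> Self {
        Self { ptr: buf.as_ptr(), len: buf.len(), _source: PhantomData }
    }

    pub fn as_slice(&self) -> &'data [u8] {
        // SAFETY: `ptr` and `len` come from a `&'data [u8]` and never change.
        unsafe { core::slice::from_raw_parts(self.ptr, self.len) }
    }
}

// SAFETY: `Source` has no thread-affine state. It only reads the buffer, which
// is valid for reads for `'data` no matter which thread performs the access.
unsafe impl<'data> Send for Source<'data> {}
// SAFETY: A `&Source` only permits reads of the cursor fields and of the
// buffer. Nothing is mutated through a shared reference, so concurrent use of
// `&Source` from several threads cannot race.
unsafe impl<'data> Sync for Source<'data> {}

/// Compiles only if `T` is both `Send` and `Sync`.
pub fn assert_send_sync<T: Send + Sync>() {}
```

The point is that each comment states which accesses are possible and
why they are fine across threads, rather than pointing at another type.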

> +unsafe impl<'data> Sync for IovIterSource<'data> {}
> +
> +impl<'data> IovIterSource<'data> {
> +    /// Obtain an `IovIterSource` from a raw pointer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// * For the duration of `'iov`, the `struct iov_iter` must remain valid and must not be
> +    ///   accessed except through the returned reference.
> +    /// * For the duration of `'data`, the buffers backing this IO vector must be valid for
> +    ///   reading.
> +    #[track_caller]
> +    #[inline]
> +    pub unsafe fn from_raw<'iov>(ptr: *mut bindings::iov_iter) -> &'iov mut IovIterSource<'data> {
> +        // SAFETY: The caller ensures that `ptr` is valid.
> +        let data_source = unsafe { (*ptr).data_source };
> +        assert_eq!(data_source, ITER_SOURCE);
> +
> +        // SAFETY: The caller ensures the struct invariants for the right durations.
> +        unsafe { &mut *ptr.cast::<IovIterSource<'data>>() }
> +    }
> +
> +    /// Access this as a raw `struct iov_iter`.
> +    #[inline]
> +    pub fn as_raw(&mut self) -> *mut bindings::iov_iter {
> +        self.iov.get()
> +    }
> +
> +    /// Returns the number of bytes available in this IO vector.
> +    ///
> +    /// Note that this may overestimate the number of bytes. For example, reading from userspace
> +    /// memory could fail with EFAULT, which will be treated as the end of the IO vector.
> +    #[inline]
> +    pub fn len(&self) -> usize {
> +        // SAFETY: It is safe to access the `count` field.

Why?

> +        unsafe {
> +            (*self.iov.get())
> +                .__bindgen_anon_1
> +                .__bindgen_anon_1
> +                .as_ref()
> +                .count
> +        }
> +    }
> +
> +    /// Returns whether there are any bytes left in this IO vector.
> +    ///
> +    /// This may return `true` even if there are no more bytes available. For example, reading from
> +    /// userspace memory could fail with EFAULT, which will be treated as the end of the IO vector.
> +    #[inline]
> +    pub fn is_empty(&self) -> bool {
> +        self.len() == 0
> +    }
> +
> +    /// Advance this IO vector by `bytes` bytes.
> +    ///
> +    /// If `bytes` is larger than the size of this IO vector, it is advanced to the end.
> +    #[inline]
> +    pub fn advance(&mut self, bytes: usize) {
> +        // SAFETY: `self.iov` is a valid IO vector.
> +        unsafe { bindings::iov_iter_advance(self.as_raw(), bytes) };
> +    }
> +
> +    /// Advance this IO vector backwards by `bytes` bytes.
> +    ///
> +    /// # Safety
> +    ///
> +    /// The IO vector must not be reverted to before its beginning.
> +    #[inline]
> +    pub unsafe fn revert(&mut self, bytes: usize) {
> +        // SAFETY: `self.iov` is a valid IO vector, and `bytes` is in bounds.
> +        unsafe { bindings::iov_iter_revert(self.as_raw(), bytes) };
> +    }
> +
> +    /// Read data from this IO vector.
> +    ///
> +    /// Returns the number of bytes that have been copied.
> +    #[inline]
> +    pub fn copy_from_iter(&mut self, out: &mut [u8]) -> usize {
> +        // SAFETY: We will not write uninitialized bytes to `out`.

You don't know what you are reading from user space. That could be
uninit bytes?
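
To make the concern concrete, here is a userspace sketch with made-up
helpers `copy_raw` and `copy` standing in for `copy_from_iter_raw` and
`copy_from_iter`. The cast is only sound because the callee never stores
an uninitialized byte into `out`. Whether `_copy_from_iter` gives the
same guarantee for user-backed iterators is what I am asking about:

```rust
use core::mem::MaybeUninit;

/// Hypothetical stand-in for `copy_from_iter_raw`: copies from `src` into
/// `out` and returns the initialized prefix of `out`.
pub fn copy_raw<'a>(src: &[u8], out: &'a mut [MaybeUninit<u8>]) -> &'a mut [u8] {
    let len = src.len().min(out.len());
    for (dst, &byte) in out[..len].iter_mut().zip(src) {
        dst.write(byte);
    }
    // SAFETY: The first `len` elements of `out` were just written with
    // initialized bytes.
    unsafe { core::slice::from_raw_parts_mut(out.as_mut_ptr().cast::<u8>(), len) }
}

/// Hypothetical stand-in for `copy_from_iter`: same operation, but the
/// destination is already initialized.
pub fn copy(src: &[u8], out: &mut [u8]) -> usize {
    // SAFETY: `MaybeUninit<u8>` has the same layout as `u8`, and `copy_raw`
    // only ever stores initialized bytes, so `out` still holds valid `u8`
    // values when this function returns.
    let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
    copy_raw(src, out).len()
}
```

In this sketch the property is obvious because `src` is a `&[u8]`. The
safety comment in the patch should say why it also holds for the C
helper.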

> +        let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
> +
> +        self.copy_from_iter_raw(out).len()
> +    }
> +
> +    /// Read data from this IO vector and append it to a vector.
> +    ///
> +    /// Returns the number of bytes that have been copied.
> +    #[inline]
> +    pub fn copy_from_iter_vec<A: Allocator>(
> +        &mut self,
> +        out: &mut Vec<u8, A>,
> +        flags: Flags,
> +    ) -> Result<usize> {
> +        out.reserve(self.len(), flags)?;
> +        let len = self.copy_from_iter_raw(out.spare_capacity_mut()).len();
> +        // SAFETY: The next `len` bytes of the vector have been initialized.
> +        unsafe { out.set_len(out.len() + len) };
> +        Ok(len)
> +    }
> +
> +    /// Read data from this IO vector into potentially uninitialized memory.
> +    ///
> +    /// Returns the sub-slice of the output that has been initialized. If the returned slice is
> +    /// shorter than the input buffer, then the entire IO vector has been read.
> +    #[inline]
> +    pub fn copy_from_iter_raw(&mut self, out: &mut [MaybeUninit<u8>]) -> &mut [u8] {
> +        // SAFETY: `out` is valid for `out.len()` bytes.
> +        let len =
> +            unsafe { bindings::_copy_from_iter(out.as_mut_ptr().cast(), out.len(), self.as_raw()) };
> +
> +        // SAFETY: We just initialized the first `len` bytes of `out`.
> +        unsafe { slice::from_raw_parts_mut(out.as_mut_ptr().cast(), len) }
> +    }
> +}
> +
> +impl<'data> Clone for IovIterSource<'data> {
> +    #[inline]
> +    fn clone(&self) -> IovIterSource<'data> {
> +        // SAFETY: This duplicates the bytes inside the `Opaque` value exactly. Since `struct
> +        // iov_iter` does not have any internal self references, that is okay.
> +        //
> +        // Since this IO vector only reads from the backing buffers, having multiple IO vectors to
> +        // the same source can't lead to data races on the backing buffers.
> +        unsafe { core::ptr::read(self) }

I'm not entirely sure about the following, so please enlighten me:

What is the `'data` here? Will it be freely assignable by the caller?
Because it's not elided to the lifetime of `&self`, right? If not, I
think the source `IovIterSource` and associated buffers might go away
before the result of the clone dies.
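
To make the question concrete, here is a userspace analogue. `Cursor`
and `clone_and_drop_original` are hypothetical names I made up for this
sketch:

```rust
/// Hypothetical userspace analogue of `IovIterSource`.
pub struct Cursor<'data> {
    buf: &'data [u8],
    pos: usize,
}

impl<'data> Cursor<'data> {
    pub fn new(buf: &'data [u8]) -> Self {
        Self { buf, pos: 0 }
    }

    pub fn advance(&mut self, n: usize) {
        self.pos = (self.pos + n).min(self.buf.len());
    }

    pub fn remaining(&self) -> &'data [u8] {
        let buf: &'data [u8] = self.buf;
        &buf[self.pos..]
    }
}

impl<'data> Clone for Cursor<'data> {
    // The result is `Cursor<'data>`; it is not tied to the lifetime of `&self`.
    fn clone(&self) -> Cursor<'data> {
        Cursor { buf: self.buf, pos: self.pos }
    }
}

/// Returns a clone that outlives the cursor it was cloned from, but not `buf`.
pub fn clone_and_drop_original<'data>(buf: &'data [u8]) -> Cursor<'data> {
    let mut original = Cursor::new(buf);
    original.advance(1);
    original.clone()
    // `original` is dropped here; the clone still borrows `buf` for `'data`.
}
```

In this analogue the clone may outlive the original cursor, but the
borrow checker still ties it to the buffer through `'data`. What I would
like to understand is whether the same reasoning carries over to the
bitwise copy of `struct iov_iter`.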


Best regards,
Andreas Hindborg


