public inbox for linux-kernel@vger.kernel.org
From: Andreas Hindborg <a.hindborg@kernel.org>
To: "Alice Ryhl" <aliceryhl@google.com>
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Benno Lossin" <benno.lossin@proton.me>,
	"Trevor Gross" <tmgross@umich.edu>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"Matthew Maurer" <mmaurer@google.com>,
	"Lee Jones" <lee@kernel.org>,
	linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org
Subject: Re: [PATCH 1/5] rust: iov: add iov_iter abstractions for ITER_SOURCE
Date: Wed, 19 Mar 2025 19:54:18 +0100
Message-ID: <871pushmw5.fsf@kernel.org>
In-Reply-To: <20250311-iov-iter-v1-1-f6c9134ea824@google.com> (Alice Ryhl's message of "Tue, 11 Mar 2025 14:25:12 +0000")

"Alice Ryhl" <aliceryhl@google.com> writes:

> This adds abstractions for the iov_iter type in the case where
> data_source is ITER_SOURCE. This will make Rust implementations of
> fops->write_iter possible.
>
> This series only has support for using existing IO vectors created by C
> code. Additional abstractions will be needed to support the creation of
> IO vectors in Rust code.
>
> These abstractions make the assumption that `struct iov_iter` does not
> have internal self-references, which implies that it is valid to move it
> between different local variables, and that you can make a copy of it to
> get two IO vectors into the same buffers.
>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  rust/kernel/iov.rs | 170 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  rust/kernel/lib.rs |   1 +
>  2 files changed, 171 insertions(+)
>
> diff --git a/rust/kernel/iov.rs b/rust/kernel/iov.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..4498f65e1f65bd964909810c020db3a9f8fae389
> --- /dev/null
> +++ b/rust/kernel/iov.rs
> @@ -0,0 +1,170 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2025 Google LLC.
> +
> +//! IO vectors.
> +//!
> +//! C headers: [`include/linux/iov_iter.h`](srctree/include/linux/iov_iter.h),
> +//! [`include/linux/uio.h`](srctree/include/linux/uio.h)
> +
> +use crate::{bindings, prelude::*, types::Opaque};
> +use core::{marker::PhantomData, mem::MaybeUninit, slice};
> +
> +const ITER_SOURCE: bool = bindings::ITER_SOURCE != 0;
> +const ITER_DEST: bool = bindings::ITER_DEST != 0;
> +
> +// Compile-time assertion for the above constants.
> +const _: () = {
> +    if ITER_SOURCE == ITER_DEST {
> +        panic!("ITER_DEST and ITER_SOURCE should be different.");
> +    }
> +};

Don't we have some kind of `build_assert` macro to do this? If not,
maybe we should.
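
For illustration, the same check can be written as a single const
assertion in plain Rust. This is only a standalone sketch: the two
constants below are stand-in values, not the real `bindings::ITER_SOURCE`
and `bindings::ITER_DEST`, and whether an in-tree macro should be used
instead is left open.

```rust
// Stand-in values for illustration only; the real ones come from `bindings`.
pub const ITER_SOURCE: bool = true;
pub const ITER_DEST: bool = false;

// Evaluated at compile time: compilation fails if the two values are equal.
const _: () = assert!(
    ITER_SOURCE != ITER_DEST,
    "ITER_DEST and ITER_SOURCE should be different."
);
```

If the two stand-in values are made equal, the build stops with the
message above instead of producing a binary.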

> +
> +/// An IO vector that acts as a source of data.
> +///
> +/// # Invariants
> +///
> +/// Must hold a valid `struct iov_iter` with `data_source` set to `ITER_SOURCE`. The buffers
> +/// referenced by the IO vector must be valid for reading for the duration of `'data`.
> +///
> +/// Note that if the IO vector is backed by a userspace pointer, it is always considered valid for
> +/// reading.

How did you arrive at this conclusion? In the discussions with Ralph on
the coherent allocator thread, we arrived at conflicting answers.

> +#[repr(transparent)]
> +pub struct IovIterSource<'data> {
> +    iov: Opaque<bindings::iov_iter>,
> +    /// Represent to the type system that this value contains a pointer to readable data it does
> +    /// not own.
> +    _source: PhantomData<&'data [u8]>,
> +}
> +
> +// SAFETY: This struct is essentially just a fancy `std::io::Cursor<&[u8]>`, and that type is safe
> +// to send across thread boundaries.

I don't think this safety requirement is enough. Could you give the
argument without deferring to this type in `std`?

> +unsafe impl<'data> Send for IovIterSource<'data> {}
> +// SAFETY: This struct is essentially just a fancy `std::io::Cursor<&[u8]>`, and that type is safe
> +// to share across thread boundaries.

Same as above.

> +unsafe impl<'data> Sync for IovIterSource<'data> {}
> +
> +impl<'data> IovIterSource<'data> {
> +    /// Obtain an `IovIterSource` from a raw pointer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// * For the duration of `'iov`, the `struct iov_iter` must remain valid and must not be
> +    ///   accessed except through the returned reference.
> +    /// * For the duration of `'data`, the buffers backing this IO vector must be valid for
> +    ///   reading.
> +    #[track_caller]
> +    #[inline]
> +    pub unsafe fn from_raw<'iov>(ptr: *mut bindings::iov_iter) -> &'iov mut IovIterSource<'data> {
> +        // SAFETY: The caller ensures that `ptr` is valid.
> +        let data_source = unsafe { (*ptr).data_source };
> +        assert_eq!(data_source, ITER_SOURCE);
> +
> +        // SAFETY: The caller ensures the struct invariants for the right durations.
> +        unsafe { &mut *ptr.cast::<IovIterSource<'data>>() }
> +    }
> +
> +    /// Access this as a raw `struct iov_iter`.
> +    #[inline]
> +    pub fn as_raw(&mut self) -> *mut bindings::iov_iter {
> +        self.iov.get()
> +    }
> +
> +    /// Returns the number of bytes available in this IO vector.
> +    ///
> +    /// Note that this may overestimate the number of bytes. For example, reading from userspace
> +    /// memory could fail with EFAULT, which will be treated as the end of the IO vector.
> +    #[inline]
> +    pub fn len(&self) -> usize {
> +        // SAFETY: It is safe to access the `count` field.

Why?

> +        unsafe {
> +            (*self.iov.get())
> +                .__bindgen_anon_1
> +                .__bindgen_anon_1
> +                .as_ref()
> +                .count
> +        }
> +    }
> +
> +    /// Returns whether there are any bytes left in this IO vector.
> +    ///
> +    /// This may return `true` even if there are no more bytes available. For example, reading from
> +    /// userspace memory could fail with EFAULT, which will be treated as the end of the IO vector.
> +    #[inline]
> +    pub fn is_empty(&self) -> bool {
> +        self.len() == 0
> +    }
> +
> +    /// Advance this IO vector by `bytes` bytes.
> +    ///
> +    /// If `bytes` is larger than the size of this IO vector, it is advanced to the end.
> +    #[inline]
> +    pub fn advance(&mut self, bytes: usize) {
> +        // SAFETY: `self.iov` is a valid IO vector.
> +        unsafe { bindings::iov_iter_advance(self.as_raw(), bytes) };
> +    }
> +
> +    /// Advance this IO vector backwards by `bytes` bytes.
> +    ///
> +    /// # Safety
> +    ///
> +    /// The IO vector must not be reverted to before its beginning.
> +    #[inline]
> +    pub unsafe fn revert(&mut self, bytes: usize) {
> +        // SAFETY: `self.iov` is a valid IO vector, and `bytes` is in bounds.
> +        unsafe { bindings::iov_iter_revert(self.as_raw(), bytes) };
> +    }
> +
> +    /// Read data from this IO vector.
> +    ///
> +    /// Returns the number of bytes that have been copied.
> +    #[inline]
> +    pub fn copy_from_iter(&mut self, out: &mut [u8]) -> usize {
> +        // SAFETY: We will not write uninitialized bytes to `out`.

You don't know what you are reading from user space. Couldn't that be
uninit bytes?
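
To make the concern concrete: the cast from `&mut [u8]` to
`&mut [MaybeUninit<u8>]` is only sound if the callee never stores an
uninitialized byte through the reborrowed slice. Below is a minimal
userspace sketch of that contract. `copy_into_uninit` and
`copy_into_init` are hypothetical stand-ins for `copy_from_iter_raw` and
`copy_from_iter`, and the source is an ordinary `&[u8]`, so the sketch
says nothing about what `_copy_from_iter` guarantees for user memory.

```rust
use core::mem::MaybeUninit;

/// Copies `src` into the front of `out` and returns the initialized prefix.
/// Hypothetical stand-in for a copy routine that only stores initialized bytes.
pub fn copy_into_uninit<'a>(src: &[u8], out: &'a mut [MaybeUninit<u8>]) -> &'a mut [u8] {
    let len = src.len().min(out.len());
    for (dst, byte) in out[..len].iter_mut().zip(src) {
        dst.write(*byte);
    }
    // SAFETY: The first `len` elements were initialized by the loop above.
    unsafe { core::slice::from_raw_parts_mut(out.as_mut_ptr().cast::<u8>(), len) }
}

/// Same operation for a buffer that is already initialized.
pub fn copy_into_init(src: &[u8], out: &mut [u8]) -> usize {
    // SAFETY: `u8` and `MaybeUninit<u8>` have the same layout, and
    // `copy_into_uninit` never stores an uninitialized byte, so `out` still
    // holds only initialized bytes when this reborrow ends.
    let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
    copy_into_uninit(src, out).len()
}
```

The second safety comment is the one that carries the weight: it has to
be argued from what the callee writes, not from what the caller passes in.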

> +        let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
> +
> +        self.copy_from_iter_raw(out).len()
> +    }
> +
> +    /// Read data from this IO vector and append it to a vector.
> +    ///
> +    /// Returns the number of bytes that have been copied.
> +    #[inline]
> +    pub fn copy_from_iter_vec<A: Allocator>(
> +        &mut self,
> +        out: &mut Vec<u8, A>,
> +        flags: Flags,
> +    ) -> Result<usize> {
> +        out.reserve(self.len(), flags)?;
> +        let len = self.copy_from_iter_raw(out.spare_capacity_mut()).len();
> +        // SAFETY: The next `len` bytes of the vector have been initialized.
> +        unsafe { out.set_len(out.len() + len) };
> +        Ok(len)
> +    }
> +
> +    /// Read data from this IO vector into potentially uninitialized memory.
> +    ///
> +    /// Returns the sub-slice of the output that has been initialized. If the returned slice is
> +    /// shorter than the input buffer, then the entire IO vector has been read.
> +    #[inline]
> +    pub fn copy_from_iter_raw(&mut self, out: &mut [MaybeUninit<u8>]) -> &mut [u8] {
> +        // SAFETY: `out` is valid for `out.len()` bytes.
> +        let len =
> +            unsafe { bindings::_copy_from_iter(out.as_mut_ptr().cast(), out.len(), self.as_raw()) };
> +
> +        // SAFETY: We just initialized the first `len` bytes of `out`.
> +        unsafe { slice::from_raw_parts_mut(out.as_mut_ptr().cast(), len) }
> +    }
> +}
> +
> +impl<'data> Clone for IovIterSource<'data> {
> +    #[inline]
> +    fn clone(&self) -> IovIterSource<'data> {
> +        // SAFETY: This duplicates the bytes inside the `Opaque` value exactly. Since `struct
> +        // iov_iter` does not have any internal self references, that is okay.
> +        //
> +        // Since this IO vector only reads from the backing buffers, having multiple IO vectors to
> +        // the same source can't lead to data races on the backing buffers.
> +        unsafe { core::ptr::read(self) }

I'm not entirely sure about the following, so please enlighten me:

What is the `'data` here? Will it be freely assignable by the caller?
Because it's not elided to the lifetime of `&self`, right? If it is not
tied to `&self`, I think the source `IovIterSource` and associated
buffers might go away before the result of the clone dies.
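
A small standalone model of the signature may help frame the question.
`Source` is a hypothetical stand-in, not the kernel type. In this model
`'data` is fixed when the value is built from a `&'data [u8]`, so a clone
may outlive the value it was cloned from, but not the buffer. Whether the
same holds for `IovIterSource` depends on callers of the unsafe
`from_raw`, where `'data` is chosen by the caller.

```rust
use core::marker::PhantomData;

/// Hypothetical stand-in for `IovIterSource<'data>`: a raw cursor over bytes
/// that it does not own.
pub struct Source<'data> {
    ptr: *const u8,
    len: usize,
    _source: PhantomData<&'data [u8]>,
}

impl<'data> Source<'data> {
    /// Ties `'data` to a real borrow of the backing buffer.
    pub fn new(buf: &'data [u8]) -> Self {
        Source { ptr: buf.as_ptr(), len: buf.len(), _source: PhantomData }
    }

    pub fn as_slice(&self) -> &'data [u8] {
        // SAFETY: `ptr` and `len` came from a `&'data [u8]` in `new`.
        unsafe { core::slice::from_raw_parts(self.ptr, self.len) }
    }
}

impl<'data> Clone for Source<'data> {
    // The result carries `'data`, not the (shorter) lifetime of `&self`.
    fn clone(&self) -> Source<'data> {
        Source { ptr: self.ptr, len: self.len, _source: PhantomData }
    }
}

/// Returns a clone that outlives the original value, but not the buffer.
pub fn clone_outlives_original<'data>(buf: &'data [u8]) -> Source<'data> {
    let original = Source::new(buf);
    original.clone()
    // `original` is dropped here; the clone is still bounded by `'data`.
}
```

In this model the borrow checker rejects any use of the clone after the
buffer is gone, because `'data` was derived from a safe borrow.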


Best regards,
Andreas Hindborg



