All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alice Ryhl <aliceryhl@google.com>
To: Andreas Hindborg <a.hindborg@kernel.org>
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Trevor Gross" <tmgross@umich.edu>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"Matthew Maurer" <mmaurer@google.com>,
	"Lee Jones" <lee@kernel.org>,
	linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org,
	"Benno Lossin" <lossin@kernel.org>
Subject: Re: [PATCH v2 1/4] rust: iov: add iov_iter abstractions for ITER_SOURCE
Date: Wed, 9 Jul 2025 11:07:34 +0000	[thread overview]
Message-ID: <aG5NdqmUdvUHqUju@google.com> (raw)
In-Reply-To: <878qkyoi6d.fsf@kernel.org>

On Tue, Jul 08, 2025 at 04:45:14PM +0200, Andreas Hindborg wrote:
> "Alice Ryhl" <aliceryhl@google.com> writes:
> > +/// # Invariants
> > +///
> > +/// Must hold a valid `struct iov_iter` with `data_source` set to `ITER_SOURCE`. For the duration
> > +/// of `'data`, it must be safe to read the data in this IO vector.
> 
> In my opinion, the phrasing you had in v1 was better:
> 
>   The buffers referenced by the IO vector must be valid for reading for
>   the duration of `'data`.
> 
> That is, I would prefer "must be valid for reading" over "it must be
> safe to read ...".

If it's backed by userspace data, then technically there aren't any
buffers that are valid for reading in the usual sense. We need to call
into special assembly to read it, and a normal pointer dereference would
be illegal.

> > +    /// Returns the number of bytes available in this IO vector.
> > +    ///
> > +    /// Note that this may overestimate the number of bytes. For example, reading from userspace
> > +    /// memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
> > +    #[inline]
> > +    pub fn len(&self) -> usize {
> > +        // SAFETY: It is safe to access the `count` field.
> 
> Reiterating my comment from v1: Why?

It's the same reason as why this is safe:

struct HasLength {
    length: usize,
}
impl HasLength {
    fn len(&self) -> usize {
        // why is this safe?
        self.length
    }
}

I'm not sure how to say it concisely. I guess it's because all access to
the iov_iter goes through the &IovIterSource.

> > +        unsafe {
> > +            (*self.iov.get())
> > +                .__bindgen_anon_1
> > +                .__bindgen_anon_1
> > +                .as_ref()
> > +                .count
> > +        }
> > +    }
> > +
> > +    /// Returns whether there are any bytes left in this IO vector.
> > +    ///
> > +    /// This may return `true` even if there are no more bytes available. For example, reading from
> > +    /// userspace memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
> > +    #[inline]
> > +    pub fn is_empty(&self) -> bool {
> > +        self.len() == 0
> > +    }
> > +
> > +    /// Advance this IO vector by `bytes` bytes.
> > +    ///
> > +    /// If `bytes` is larger than the size of this IO vector, it is advanced to the end.
> > +    #[inline]
> > +    pub fn advance(&mut self, bytes: usize) {
> > +        // SAFETY: `self.iov` is a valid IO vector.
> > +        unsafe { bindings::iov_iter_advance(self.as_raw(), bytes) };
> > +    }
> > +
> > +    /// Advance this IO vector backwards by `bytes` bytes.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// The IO vector must not be reverted to before its beginning.
> > +    #[inline]
> > +    pub unsafe fn revert(&mut self, bytes: usize) {
> > +        // SAFETY: `self.iov` is a valid IO vector, and `bytes` is in bounds.
> > +        unsafe { bindings::iov_iter_revert(self.as_raw(), bytes) };
> > +    }
> > +
> > +    /// Read data from this IO vector.
> > +    ///
> > +    /// Returns the number of bytes that have been copied.
> > +    #[inline]
> > +    pub fn copy_from_iter(&mut self, out: &mut [u8]) -> usize {
> > +        // SAFETY: We will not write uninitialized bytes to `out`.
> 
> Can you provide something to back this claim?

I guess the logic could go along these lines:

* If the iov_iter reads from userspace, then it's because we always
  consider such reads to produce initialized data.
* If the iov_iter reads from a kernel buffer, then the creator of the
  iov_iter must provide an initialized buffer.

Ultimately, if we don't know that the bytes are initialized, then it's
impossible to use the API correctly because you can never inspect the
bytes in any way. I.e., any implementation of copy_from_iter that
produces uninit data is necessarily buggy.

Alice

  reply	other threads:[~2025-07-09 11:07 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-04  9:25 [PATCH v2 0/4] Rust support for `struct iov_iter` Alice Ryhl
2025-07-04  9:26 ` [PATCH v2 1/4] rust: iov: add iov_iter abstractions for ITER_SOURCE Alice Ryhl
2025-07-08 14:45   ` Andreas Hindborg
2025-07-09 11:07     ` Alice Ryhl [this message]
2025-07-09 11:56       ` Andreas Hindborg
2025-07-09 12:35         ` Alice Ryhl
2025-07-09 17:05           ` Andreas Hindborg
2025-07-14 12:18             ` Alice Ryhl
2025-08-05 10:48               ` Andreas Hindborg
2025-07-04  9:26 ` [PATCH v2 2/4] rust: iov: add iov_iter abstractions for ITER_DEST Alice Ryhl
2025-07-08 14:47   ` Andreas Hindborg
2025-07-09 10:58     ` Alice Ryhl
2025-07-04  9:26 ` [PATCH v2 3/4] rust: miscdevice: Provide additional abstractions for iov_iter and kiocb structures Alice Ryhl
2025-07-08 14:51   ` Andreas Hindborg
2025-07-09 11:09     ` Alice Ryhl
2025-07-09 11:58       ` Andreas Hindborg
2025-07-08 14:53   ` Andreas Hindborg
2025-07-09 11:12     ` Alice Ryhl
2025-07-09 11:59       ` Andreas Hindborg
2025-07-04  9:26 ` [PATCH v2 4/4] samples: rust_misc_device: Expand the sample to support read()ing from userspace Alice Ryhl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aG5NdqmUdvUHqUju@google.com \
    --to=aliceryhl@google.com \
    --cc=a.hindborg@kernel.org \
    --cc=arnd@arndb.de \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=dakr@kernel.org \
    --cc=gary@garyguo.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=lee@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lossin@kernel.org \
    --cc=mmaurer@google.com \
    --cc=ojeda@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=tmgross@umich.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.