Re: [PATCH v3] rust: alloc: implement `extend` for `Vec`

public inbox for rust-for-linux@vger.kernel.org
 help / color / mirror / Atom feed

From: Benno Lossin <benno.lossin@proton.me>
To: "Alexandre Courbot" <acourbot@nvidia.com>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Andreas Hindborg" <a.hindborg@kernel.org>,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Trevor Gross" <tmgross@umich.edu>
Cc: Joel Fernandes <joelagnelf@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>,
	rust-for-linux@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] rust: alloc: implement `extend` for `Vec`
Date: Mon, 07 Apr 2025 16:33:25 +0000	[thread overview]
Message-ID: <D90JUX1YE0MR.25XH5EZLUCDBM@proton.me> (raw)

On Sun Apr 6, 2025 at 3:01 PM CEST, Alexandre Courbot wrote:
> diff --git a/rust/kernel/alloc/kvec.rs b/rust/kernel/alloc/kvec.rs
> index ae9d072741cedbb34bed0be0c20cc75472aa53be..b3c4227e232cca3b5c17930abc63810240b9c4da 100644
> --- a/rust/kernel/alloc/kvec.rs
> +++ b/rust/kernel/alloc/kvec.rs
> @@ -454,30 +454,86 @@ pub fn reserve(&mut self, additional: usize, flags: Flags) -> Result<(), AllocEr
>      }
>  }
>  
> +impl<T, A: Allocator> Vec<T, A> {
> +    /// Extends the vector by the elements of `iter`.
> +    ///
> +    /// This uses [`Iterator::size_hint`] to optimize memory reallocations, but will work even with
> +    /// imprecise implementations - albeit in a non-optimal way.
> +    ///
> +    /// This method returns an error if a memory reallocation required to accommodate the new items
> +    /// failed. In this case, callers must assume that some (but not all) elements of `iter` might
> +    /// have been added to the vector.

I would add that items that haven't been added to the vector remain in
the iterator. Do note that some iterators drops the items when the
iterator is dropped. But if not, one can still access them later:

    let vec1 = vec![...];
    let mut vec2 = KVec::new();
    let mut iter = vec1.iter();
    if let Err(_) = vec2.extend(&mut iter) {
        // can access remaining items:
        for item in iter {
            pr_info!("item: {item:?}\n");
        }
    }

> +    ///
> +    /// # Note on optimal behavior and correctness
> +    ///
> +    /// The efficiency of this method depends on how reliable the [`Iterator::size_hint`]
> +    /// implementation of the `iter` is.
> +    ///
> +    /// It performs optimally with at most a single memory reallocation if the lower bound of
> +    /// `size_hint` is the exact number of items actually yielded.
> +    ///
> +    /// If `size_hint` is more vague, there may be as many memory reallocations as necessary to
> +    /// cover the whole iterator from the successive lower bounds returned by `size_hint`.
> +    ///
> +    /// If `size_hint` signals more items than actually yielded by the iterator, some unused memory
> +    /// might be reserved.
> +    ///
> +    /// Finally, whenever `size_hint` returns `(0, Some(0))`, the method assumes that no more items
> +    /// are yielded by the iterator and returns. This may result in some items not being added if
> +    /// there were still some remaining.

Why do this? Can't we just call `next` and see if that is `None` too?
You could use `Peekable` [1] to call `peek` instead, then the logic
below shouldn't be too complex.

[1]: https://doc.rust-lang.org/core/iter/trait.Iterator.html#method.peekable

> +    ///
> +    /// In the kernel most iterators are expected to have a precise and correct `size_hint`
> +    /// implementation, so this should nicely optimize out for these cases.
> +    pub fn extend<I>(&mut self, iter: I, flags: Flags) -> Result<(), AllocError>
> +    where
> +        I: IntoIterator<Item = T>,
> +    {
> +        let mut iter = iter.into_iter();
> +
> +        loop {
> +            let low_bound = match iter.size_hint() {
> +                // No more items expected, we can return.
> +                (0, Some(0)) => break,
> +                // Possibly more items but not certain, tentatively add one.
> +                (0, _) => 1,
> +                // More items pending, reserve space for the lower bound.
> +                (low_bound, _) => low_bound,
> +            };
> +
> +            self.reserve(low_bound, flags)?;
> +
> +            // Number of items we actually added.
> +            let added_items = self
> +                .spare_capacity_mut()
> +                .iter_mut()
> +                // Take a mutable reference to the iterator so we can reuse it in the next
> +                // iteration of the loop if needed.
> +                .zip(&mut iter)
> +                .fold(0, |count, (dst, src)| {
> +                    dst.write(src);
> +
> +                    count + 1
> +                });
> +
> +            // SAFETY:
> +            // - `self.len() + added_items <= self.capacity()` due to the call to `reserve` above,

In my mind this precondition holds, since

    self.spare_capacity_mut().len() + self.len() == self.capacity()

and

    added_items <= self.spare_capacity_mut().len()

But I guess we haven't written the first one down anywhere.

> +            // - items `[self.len()..self.len() + added_items - 1]` are initialized.
> +            unsafe { self.set_len(self.len() + added_items) };
> +
> +            // `size_hint` was incorrect and our iterator ended before its advertized lower bound.
> +            if added_items < low_bound {
> +                break;
> +            }
> +        }
> +
> +        Ok(())
> +    }
> +}
> +
>  impl<T: Clone, A: Allocator> Vec<T, A> {
>      /// Extend the vector by `n` clones of `value`.
>      pub fn extend_with(&mut self, n: usize, value: T, flags: Flags) -> Result<(), AllocError> {
> -        if n == 0 {
> -            return Ok(());
> -        }
> -
> -        self.reserve(n, flags)?;
> -
> -        let spare = self.spare_capacity_mut();
> -
> -        for item in spare.iter_mut().take(n - 1) {
> -            item.write(value.clone());
> -        }
> -
> -        // We can write the last element directly without cloning needlessly.
> -        spare[n - 1].write(value);
> -
> -        // SAFETY:
> -        // - `self.len() + n < self.capacity()` due to the call to reserve above,
> -        // - the loop and the line above initialized the next `n` elements.
> -        unsafe { self.set_len(self.len() + n) };
> -
> -        Ok(())
> +        self.extend(core::iter::repeat_n(value, n), flags)

`repeat_n` is not stable in 1.78.0, so you need to add `iter_repeat_n`
to the allowed features.

---
Cheers,
Benno

>      }
>  
>      /// Pushes clones of the elements of slice into the [`Vec`] instance.
> @@ -496,18 +552,7 @@ pub fn extend_with(&mut self, n: usize, value: T, flags: Flags) -> Result<(), Al
>      /// # Ok::<(), Error>(())
>      /// ```
>      pub fn extend_from_slice(&mut self, other: &[T], flags: Flags) -> Result<(), AllocError> {
> -        self.reserve(other.len(), flags)?;
> -        for (slot, item) in core::iter::zip(self.spare_capacity_mut(), other) {
> -            slot.write(item.clone());
> -        }
> -
> -        // SAFETY:
> -        // - `other.len()` spare entries have just been initialized, so it is safe to increase
> -        //   the length by the same number.
> -        // - `self.len() + other.len() <= self.capacity()` is guaranteed by the preceding `reserve`
> -        //   call.
> -        unsafe { self.set_len(self.len() + other.len()) };
> -        Ok(())
> +        self.extend(other.iter().cloned(), flags)
>      }
>  
>      /// Create a new `Vec<T, A>` and extend it by `n` clones of `value`.
>
> ---
> base-commit: a2cc6ff5ec8f91bc463fd3b0c26b61166a07eb11
> change-id: 20250405-vec_extend-4321251acc21
>
> Best regards,

next             reply	other threads:[~2025-04-07 16:33 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-07 16:33 Benno Lossin [this message]
2025-04-08 14:00 ` [PATCH v3] rust: alloc: implement `extend` for `Vec` Alexandre Courbot
  -- strict thread matches above, loose matches on Subject: below --
2025-04-06 13:01 Alexandre Courbot
2025-04-07 11:01 ` Danilo Krummrich
2025-04-08 13:34   ` Alexandre Courbot
2025-04-21  8:15     ` Alexandre Courbot
2025-04-22 17:03       ` Danilo Krummrich
2025-04-23  1:02         ` Alexandre Courbot
2025-04-23  8:51           ` Alice Ryhl
2025-04-23  9:40             ` Alexandre Courbot
2025-04-23 16:03               ` Boqun Feng
2025-04-24 11:50                 ` Alice Ryhl
2025-04-24 13:36                   ` Boqun Feng
2025-04-23  9:47           ` Danilo Krummrich
2025-04-23 13:15             ` Alexandre Courbot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D90JUX1YE0MR.25XH5EZLUCDBM@proton.me \
    --to=benno.lossin@proton.me \
    --cc=a.hindborg@kernel.org \
    --cc=acourbot@nvidia.com \
    --cc=alex.gaynor@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=dakr@kernel.org \
    --cc=gary@garyguo.net \
    --cc=jhubbard@nvidia.com \
    --cc=joelagnelf@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ojeda@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=tmgross@umich.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox