public inbox for rust-for-linux@vger.kernel.org
 help / color / mirror / Atom feed
From: "Alexandre Courbot" <acourbot@nvidia.com>
To: "Benno Lossin" <benno.lossin@proton.me>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Andreas Hindborg" <a.hindborg@kernel.org>,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Trevor Gross" <tmgross@umich.edu>
Cc: "Joel Fernandes" <joelagnelf@nvidia.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	<rust-for-linux@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3] rust: alloc: implement `extend` for `Vec`
Date: Tue, 08 Apr 2025 23:00:03 +0900	[thread overview]
Message-ID: <D91B844M3AIC.1HUEPDZZX2O9C@nvidia.com> (raw)
In-Reply-To: <D90JUX1YE0MR.25XH5EZLUCDBM@proton.me>

On Tue Apr 8, 2025 at 1:33 AM JST, Benno Lossin wrote:
> On Sun Apr 6, 2025 at 3:01 PM CEST, Alexandre Courbot wrote:
>> diff --git a/rust/kernel/alloc/kvec.rs b/rust/kernel/alloc/kvec.rs
>> index ae9d072741cedbb34bed0be0c20cc75472aa53be..b3c4227e232cca3b5c17930abc63810240b9c4da 100644
>> --- a/rust/kernel/alloc/kvec.rs
>> +++ b/rust/kernel/alloc/kvec.rs
>> @@ -454,30 +454,86 @@ pub fn reserve(&mut self, additional: usize, flags: Flags) -> Result<(), AllocEr
>>      }
>>  }
>>  
>> +impl<T, A: Allocator> Vec<T, A> {
>> +    /// Extends the vector by the elements of `iter`.
>> +    ///
>> +    /// This uses [`Iterator::size_hint`] to optimize memory reallocations, but will work even with
>> +    /// imprecise implementations - albeit in a non-optimal way.
>> +    ///
>> +    /// This method returns an error if a memory reallocation required to accommodate the new items
>> +    /// failed. In this case, callers must assume that some (but not all) elements of `iter` might
>> +    /// have been added to the vector.
>
> I would add that items that haven't been added to the vector remain in
> the iterator. Do note that some iterators drops the items when the
> iterator is dropped. But if not, one can still access them later:
>
>     let vec1 = vec![...];
>     let mut vec2 = KVec::new();
>     let mut iter = vec1.iter();
>     if let Err(_) = vec2.extend(&mut iter) {
>         // can access remaining items:
>         for item in iter {
>             pr_info!("item: {item:?}\n");
>         }
>     }

Indeed, better to mention that, although I guess that if we go with our
own unsafe exact size trait like Danilo suggested we won't ever need to
allocate more than once anyway.

>
>> +    ///
>> +    /// # Note on optimal behavior and correctness
>> +    ///
>> +    /// The efficiency of this method depends on how reliable the [`Iterator::size_hint`]
>> +    /// implementation of the `iter` is.
>> +    ///
>> +    /// It performs optimally with at most a single memory reallocation if the lower bound of
>> +    /// `size_hint` is the exact number of items actually yielded.
>> +    ///
>> +    /// If `size_hint` is more vague, there may be as many memory reallocations as necessary to
>> +    /// cover the whole iterator from the successive lower bounds returned by `size_hint`.
>> +    ///
>> +    /// If `size_hint` signals more items than actually yielded by the iterator, some unused memory
>> +    /// might be reserved.
>> +    ///
>> +    /// Finally, whenever `size_hint` returns `(0, Some(0))`, the method assumes that no more items
>> +    /// are yielded by the iterator and returns. This may result in some items not being added if
>> +    /// there were still some remaining.
>
> Why do this? Can't we just call `next` and see if that is `None` too?
> You could use `Peekable` [1] to call `peek` instead, then the logic
> below shouldn't be too complex.
>
> [1]: https://doc.rust-lang.org/core/iter/trait.Iterator.html#method.peekable

What worried me there was that the compiler might not be able to
optimize the loop away if we always do a last checking `next()` call,
hence this shortcut to avoid an extra loop for the most obvious cases.
But again this should be solved by Danilo's suggestion.

>
>> +    ///
>> +    /// In the kernel most iterators are expected to have a precise and correct `size_hint`
>> +    /// implementation, so this should nicely optimize out for these cases.
>> +    pub fn extend<I>(&mut self, iter: I, flags: Flags) -> Result<(), AllocError>
>> +    where
>> +        I: IntoIterator<Item = T>,
>> +    {
>> +        let mut iter = iter.into_iter();
>> +
>> +        loop {
>> +            let low_bound = match iter.size_hint() {
>> +                // No more items expected, we can return.
>> +                (0, Some(0)) => break,
>> +                // Possibly more items but not certain, tentatively add one.
>> +                (0, _) => 1,
>> +                // More items pending, reserve space for the lower bound.
>> +                (low_bound, _) => low_bound,
>> +            };
>> +
>> +            self.reserve(low_bound, flags)?;
>> +
>> +            // Number of items we actually added.
>> +            let added_items = self
>> +                .spare_capacity_mut()
>> +                .iter_mut()
>> +                // Take a mutable reference to the iterator so we can reuse it in the next
>> +                // iteration of the loop if needed.
>> +                .zip(&mut iter)
>> +                .fold(0, |count, (dst, src)| {
>> +                    dst.write(src);
>> +
>> +                    count + 1
>> +                });
>> +
>> +            // SAFETY:
>> +            // - `self.len() + added_items <= self.capacity()` due to the call to `reserve` above,
>
> In my mind this precondition holds, since
>
>     self.spare_capacity_mut().len() + self.len() == self.capacity()
>
> and
>
>     added_items <= self.spare_capacity_mut().len()
>
> But I guess we haven't written the first one down anywhere.
>
>> +            // - items `[self.len()..self.len() + added_items - 1]` are initialized.
>> +            unsafe { self.set_len(self.len() + added_items) };
>> +
>> +            // `size_hint` was incorrect and our iterator ended before its advertized lower bound.
>> +            if added_items < low_bound {
>> +                break;
>> +            }
>> +        }
>> +
>> +        Ok(())
>> +    }
>> +}
>> +
>>  impl<T: Clone, A: Allocator> Vec<T, A> {
>>      /// Extend the vector by `n` clones of `value`.
>>      pub fn extend_with(&mut self, n: usize, value: T, flags: Flags) -> Result<(), AllocError> {
>> -        if n == 0 {
>> -            return Ok(());
>> -        }
>> -
>> -        self.reserve(n, flags)?;
>> -
>> -        let spare = self.spare_capacity_mut();
>> -
>> -        for item in spare.iter_mut().take(n - 1) {
>> -            item.write(value.clone());
>> -        }
>> -
>> -        // We can write the last element directly without cloning needlessly.
>> -        spare[n - 1].write(value);
>> -
>> -        // SAFETY:
>> -        // - `self.len() + n < self.capacity()` due to the call to reserve above,
>> -        // - the loop and the line above initialized the next `n` elements.
>> -        unsafe { self.set_len(self.len() + n) };
>> -
>> -        Ok(())
>> +        self.extend(core::iter::repeat_n(value, n), flags)
>
> `repeat_n` is not stable in 1.78.0, so you need to add `iter_repeat_n`
> to the allowed features.

Indeed - added the feature, thanks for catching this.

Thanks,
Alex.

  reply	other threads:[~2025-04-08 14:00 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-07 16:33 [PATCH v3] rust: alloc: implement `extend` for `Vec` Benno Lossin
2025-04-08 14:00 ` Alexandre Courbot [this message]
  -- strict thread matches above, loose matches on Subject: below --
2025-04-06 13:01 Alexandre Courbot
2025-04-07 11:01 ` Danilo Krummrich
2025-04-08 13:34   ` Alexandre Courbot
2025-04-21  8:15     ` Alexandre Courbot
2025-04-22 17:03       ` Danilo Krummrich
2025-04-23  1:02         ` Alexandre Courbot
2025-04-23  8:51           ` Alice Ryhl
2025-04-23  9:40             ` Alexandre Courbot
2025-04-23 16:03               ` Boqun Feng
2025-04-24 11:50                 ` Alice Ryhl
2025-04-24 13:36                   ` Boqun Feng
2025-04-23  9:47           ` Danilo Krummrich
2025-04-23 13:15             ` Alexandre Courbot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D91B844M3AIC.1HUEPDZZX2O9C@nvidia.com \
    --to=acourbot@nvidia.com \
    --cc=a.hindborg@kernel.org \
    --cc=alex.gaynor@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=benno.lossin@proton.me \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=dakr@kernel.org \
    --cc=gary@garyguo.net \
    --cc=jhubbard@nvidia.com \
    --cc=joelagnelf@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ojeda@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=tmgross@umich.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox