All of lore.kernel.org
 help / color / mirror / Atom feed
From: Boqun Feng <boqun.feng@gmail.com>
To: Benno Lossin <benno.lossin@proton.me>
Cc: rust-for-linux@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, "Miguel Ojeda" <ojeda@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Wedson Almeida Filho" <wedsonaf@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Andreas Hindborg" <a.hindborg@samsung.com>,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Viresh Kumar" <viresh.kumar@linaro.org>,
	"Danilo Krummrich" <dakr@redhat.com>,
	"Trevor Gross" <tmgross@umich.edu>,
	gregkh@linuxfoundation.org
Subject: Re: [RFC PATCH] rust: types: Add explanation for ARef pattern
Date: Thu, 25 Jul 2024 13:43:26 -0700	[thread overview]
Message-ID: <ZqK47uNmutavG9Vh@boqun-archlinux> (raw)
In-Reply-To: <ba35b142-a5ad-454f-8ed8-216bc6cf4d9c@proton.me>

On Thu, Jul 25, 2024 at 08:32:10PM +0000, Benno Lossin wrote:
> On 25.07.24 22:06, Boqun Feng wrote:
> > Hi Benno,
> > 
> > Thanks for taking a look.
> > 
> > On Thu, Jul 25, 2024 at 06:51:56PM +0000, Benno Lossin wrote:
> >> On 10.07.24 05:24, Boqun Feng wrote:
> >>> As the usage of `ARef` and `AlwaysRefCounted` is growing, it makes sense
> >>> to add explanation of the "ARef pattern" to cover the most "DO" and "DO
> >>> NOT" cases when wrapping a self-refcounted C type.
> >>>
> >>> Hence an "ARef pattern" section is added in the documentation of `ARef`.
> >>>
> >>> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> >>> ---
> >>> This is motivated by:
> >>>
> >>> 	https://lore.kernel.org/rust-for-linux/20240705110228.qqhhynbwwuwpcdeo@vireshk-i7/
> >>>
> >>>  rust/kernel/types.rs | 156 +++++++++++++++++++++++++++++++++++++++++++
> >>>  1 file changed, 156 insertions(+)
> >>>
> >>> diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
> >>> index bd189d646adb..70fdc780882e 100644
> >>> --- a/rust/kernel/types.rs
> >>> +++ b/rust/kernel/types.rs
> >>> @@ -329,6 +329,162 @@ pub unsafe trait AlwaysRefCounted {
> >>>  ///
> >>>  /// The pointer stored in `ptr` is non-null and valid for the lifetime of the [`ARef`] instance. In
> >>>  /// particular, the [`ARef`] instance owns an increment on the underlying object's reference count.
> >>> +///
> >>> +/// # [`ARef`] pattern
> >>> +///
> >>> +/// "[`ARef`] pattern" is preferred when wrapping a C struct which has its own refcounting
> >>
> >> I would have written "[...] struct which is reference-counted, because
> >> [...]", is there a specific reason you wrote "its own"?
> >>
> > 
> > "its own" indicates the reference counters are inside the object (i.e.
> > self refcounted), it's different than `Arc<T>` where the reference
> > counters are "attached" to `T`. Your version looks good to me as well.
> 
> I thought about that as well, but the paragraph above talks about a C
> struct, so what is meant with "its own" there?
> 

Still the same thing, the `refcount_t` (or other reference counter
types) is inside the structure other than outside, one example of an
outside reference-counting is:

	struct foo_ref {
		struct foo *foo;
		refcount_t ref;
	}

	struct foo {
		struct foo_ref *ref;
		...
	}

TBH, I'm not sure whether that case really exist and we care or we want
to use `ARef<Foo>` for that case. So I just put "its own" to avoid that
for now and for the documentation purpose.

> >>> +/// mechanism, because it decouples the operations on the object itself (usually via a `&Foo`) vs the
> >>> +/// operations on a pointer to the object (usually via an `ARef<Foo>`). For example, given a `struct
> >>
> >> Not exactly sure I understand your point here, what exactly is the
> >> advantage of decoupling the operations?
> >> In my mind the following points are the advantages of using `ARef`:
> >> (1) prevents having to implement multiple abstractions for a single C
> >>     object: say there is a `struct foo` that is both used via reference
> >>     counting and by-value on the stack. Without `ARef`, we would have to
> >>     write two abstractions, one for each use-case. With `ARef`, we can
> >>     have one `Foo` that can be wrapped with `ARef` to represent a
> >>     reference-counted object.
> >> (2) `ARef<T>` always represents a reference counted object, so it helps
> >>     with understanding the code. If you read `Foo`, you cannot be sure
> >>     if it is heap or stack allocated.
> >> (3) generalizes common code of reference-counted objects (ie avoiding
> >>     code duplication) and concentration of `unsafe` code.
> >>
> >> In my opinion (1) is the most important, then (2). And (3) is a nice
> >> bonus. If you agree with the list above (maybe you also have additional
> >> advantages of `ARef`?) then it would be great if you could also add them
> >> somewhere here.
> >>
> > 
> > Basically to me, the advantages are mostly (1) and (2) in your list,
> > thank you for the list. And I did try to use an example (below) to
> > explain these, because I felt an example of the bad cases is
> > straightforward.
> > 
> > I will add your list here, because although an example may be
> > straightforward of reading, a list of advantages are better for
> > references. Again, thanks a lot!
> > 
> >>> +/// foo` defined in C, which has its own refcounting operations `get_foo()` and `put_foo()`. Without
> >>> +/// "[`ARef`] pattern", i.e. **bad case**:
> >>
> >> Instead of "bad case" I would have written "i.e. you want to avoid this:".
> >>
> > 
> > I'm OK with your version, but for my personal interest, why? ;-)
> 
> I felt like "bad case" did not "flow" right when reading and I also
> think that "you want to avoid this" sounds more polite :)
> 

Got it, will use "you want to avoid this" then.

> >>> +///
> >>> +/// ```ignore
> >>> +/// pub struct Foo(NonNull<foo>);
> >>> +///
> >>> +/// impl Foo {
> >>> +///     // An operation on the pointer.
> >>> +///     pub unsafe fn from_ptr(ptr: *mut foo) -> Self {
> >>> +///         // Note that whether `get_foo()` is needed here depends on the exact semantics of
> >>> +///         // `from_ptr()`: is it creating a new reference, or it continues using the caller's
> >>> +///         // reference?
> >>> +///         unsafe { get_foo(ptr); }
> >>> +///
> >>> +///         unsafe { Foo(NonNull::new_unchecked(foo)) }
> >>> +///     }
> >>> +///
> >>> +///     // An operation on the object.
> >>> +///     pub fn get_bar(&self) -> Bar {
> >>> +///         unsafe { (*foo.0.as_ptr()).bar }
> >>> +///     }
> >>> +/// }
> >>> +///
> >>> +/// // Plus `impl Clone` and `impl Drop` are also needed to implement manually.
> >>> +/// impl Clone for Foo {
> >>> +///     fn clone(&self) -> Self {
> >>> +///         unsafe { get_foo(self.0.as_ptr()); }
> >>> +///
> >>> +///         Foo(self.0)
> >>> +///     }
> >>> +/// }
> >>> +///
> >>> +/// impl Drop for Foo {
> >>> +///     fn drop(&mut self) {
> >>> +///         unsafe { put_foo(self.0.as_ptr()); }
> >>> +///     }
> >>> +/// }
> >>> +/// ```
> >>> +///
> >>> +/// In this case, it's hard to tell whether `Foo` represent an object of `foo` or a pointer to
> >>> +/// `foo`.
> >>> +///
> >>> +/// However, if using [`ARef`] pattern, `foo` can be wrapped as follow:
> >>> +///
> >>> +/// ```ignore
> >>> +/// /// Note: `Opaque` is needed in most cases since there usually exist C operations on
> >>
> >> I would disagree for the reason that `Opaque` is needed. You need it if
> >> the `foo` eg contains a bool, since C might just write a nonsense
> >> integer which would then result in immediate UB in Rust.
> >> Other reasons might be that certain bytes of `foo` are written to by
> >> other threads, even though on the Rust side we have `&mut Foo` (eg a
> >> `mutex`).
> >>
> > 
> > hmm.. "since there usually exist C operations on ..." include these two
> > cases you mentioned, no? Plus, the reference counters themselves are not
> > marked as atomic at the moment, so without `Opaque`, we also have UB
> > because of the reference counters. I was trying to summarize all these
> > as "C operations on ...", maybe I should say "concurrent C operations on
> > ..."? I am trying to be concise here since it's a comment inside a
> > comment ;-)
> 
> Ah that is your definition of "C operations", I interpreted it as "there
> are functions that take `struct foo *`". So maybe it would be good to
> spell out exactly why `Opaque` might be needed.
> I think its fine to be verbose here.
> 

Will do.

Regards,
Boqun

> ---
> Cheers,
> Benno
> 
> >>> +/// /// `struct foo *`, and `#[repr(transparent)]` is needed for the safety of converting a `*mut
> >>> +/// /// foo` to a `*mut Foo`
> >>> +/// #[repr(transparent)]
> >>> +/// pub struct Foo(Opaque<foo>);
> >>> +///
> >>> +/// impl Foo {
> >>> +///     pub fn get_bar(&self) -> Bar {
> >>> +///         // SAFETY: `self.0.get()` is a valid pointer.
> >>> +///         //
> >>> +///         // Note: Usually extra safety comments are needed here to explain why accessing `.bar`
> >>> +///         // doesn't race with C side. Most cases are either calling a C function, which has its
> >>> +///         // own concurrent access protection, or holding a lock.
> >>> +///         unsafe { (*self.0.get()).bar }
> >>> +///     }
> >>> +/// }
> >>> +/// ```
> >>> +///
> >>> +/// ## Avoid `impl AlwaysRefCounted` if unnecesarry
> 

      reply	other threads:[~2024-07-25 20:43 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-10  3:24 [RFC PATCH] rust: types: Add explanation for ARef pattern Boqun Feng
2024-07-23  9:14 ` Alice Ryhl
2024-07-24 17:44   ` Boqun Feng
2024-07-25 18:12     ` Benno Lossin
2024-07-25 20:29       ` Boqun Feng
2024-07-26 13:43         ` Benno Lossin
2024-07-26 14:26           ` Boqun Feng
2024-07-26 14:42             ` Benno Lossin
2024-07-26 15:15               ` Boqun Feng
2024-07-26 15:54                 ` Benno Lossin
2024-07-26 16:19                   ` Danilo Krummrich
2024-07-29 11:31                     ` Alice Ryhl
2024-07-31 14:48                       ` Benno Lossin
2024-07-25 18:51 ` Benno Lossin
2024-07-25 20:06   ` Boqun Feng
2024-07-25 20:32     ` Benno Lossin
2024-07-25 20:43       ` Boqun Feng [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZqK47uNmutavG9Vh@boqun-archlinux \
    --to=boqun.feng@gmail.com \
    --cc=a.hindborg@samsung.com \
    --cc=alex.gaynor@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=benno.lossin@proton.me \
    --cc=bjorn3_gh@protonmail.com \
    --cc=corbet@lwn.net \
    --cc=dakr@redhat.com \
    --cc=gary@garyguo.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ojeda@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=tmgross@umich.edu \
    --cc=viresh.kumar@linaro.org \
    --cc=wedsonaf@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.