rust-for-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Boqun Feng <boqun.feng@gmail.com>
To: Benno Lossin <benno.lossin@proton.me>
Cc: rust-for-linux@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, "Miguel Ojeda" <ojeda@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Wedson Almeida Filho" <wedsonaf@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Andreas Hindborg" <a.hindborg@samsung.com>,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Viresh Kumar" <viresh.kumar@linaro.org>,
	"Danilo Krummrich" <dakr@redhat.com>,
	"Trevor Gross" <tmgross@umich.edu>,
	gregkh@linuxfoundation.org
Subject: Re: [RFC PATCH] rust: types: Add explanation for ARef pattern
Date: Thu, 25 Jul 2024 13:43:26 -0700	[thread overview]
Message-ID: <ZqK47uNmutavG9Vh@boqun-archlinux> (raw)
In-Reply-To: <ba35b142-a5ad-454f-8ed8-216bc6cf4d9c@proton.me>

On Thu, Jul 25, 2024 at 08:32:10PM +0000, Benno Lossin wrote:
> On 25.07.24 22:06, Boqun Feng wrote:
> > Hi Benno,
> > 
> > Thanks for taking a look.
> > 
> > On Thu, Jul 25, 2024 at 06:51:56PM +0000, Benno Lossin wrote:
> >> On 10.07.24 05:24, Boqun Feng wrote:
> >>> As the usage of `ARef` and `AlwaysRefCounted` is growing, it makes sense
> >>> to add explanation of the "ARef pattern" to cover the most "DO" and "DO
> >>> NOT" cases when wrapping a self-refcounted C type.
> >>>
> >>> Hence an "ARef pattern" section is added in the documentation of `ARef`.
> >>>
> >>> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> >>> ---
> >>> This is motivated by:
> >>>
> >>> 	https://lore.kernel.org/rust-for-linux/20240705110228.qqhhynbwwuwpcdeo@vireshk-i7/
> >>>
> >>>  rust/kernel/types.rs | 156 +++++++++++++++++++++++++++++++++++++++++++
> >>>  1 file changed, 156 insertions(+)
> >>>
> >>> diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
> >>> index bd189d646adb..70fdc780882e 100644
> >>> --- a/rust/kernel/types.rs
> >>> +++ b/rust/kernel/types.rs
> >>> @@ -329,6 +329,162 @@ pub unsafe trait AlwaysRefCounted {
> >>>  ///
> >>>  /// The pointer stored in `ptr` is non-null and valid for the lifetime of the [`ARef`] instance. In
> >>>  /// particular, the [`ARef`] instance owns an increment on the underlying object's reference count.
> >>> +///
> >>> +/// # [`ARef`] pattern
> >>> +///
> >>> +/// "[`ARef`] pattern" is preferred when wrapping a C struct which has its own refcounting
> >>
> >> I would have written "[...] struct which is reference-counted, because
> >> [...]", is there a specific reason you wrote "its own"?
> >>
> > 
> > "its own" indicates the reference counters are inside the object (i.e.
> > self refcounted), it's different than `Arc<T>` where the reference
> > counters are "attached" to `T`. Your version looks good to me as well.
> 
> I thought about that as well, but the paragraph above talks about a C
> struct, so what is meant with "its own" there?
> 

Still the same thing, the `refcount_t` (or other reference counter
types) is inside the structure other than outside, one example of an
outside reference-counting is:

	struct foo_ref {
		struct foo *foo;
		refcount_t ref;
	}

	struct foo {
		struct foo_ref *ref;
		...
	}

TBH, I'm not sure whether that case really exist and we care or we want
to use `ARef<Foo>` for that case. So I just put "its own" to avoid that
for now and for the documentation purpose.

> >>> +/// mechanism, because it decouples the operations on the object itself (usually via a `&Foo`) vs the
> >>> +/// operations on a pointer to the object (usually via an `ARef<Foo>`). For example, given a `struct
> >>
> >> Not exactly sure I understand your point here, what exactly is the
> >> advantage of decoupling the operations?
> >> In my mind the following points are the advantages of using `ARef`:
> >> (1) prevents having to implement multiple abstractions for a single C
> >>     object: say there is a `struct foo` that is both used via reference
> >>     counting and by-value on the stack. Without `ARef`, we would have to
> >>     write two abstractions, one for each use-case. With `ARef`, we can
> >>     have one `Foo` that can be wrapped with `ARef` to represent a
> >>     reference-counted object.
> >> (2) `ARef<T>` always represents a reference counted object, so it helps
> >>     with understanding the code. If you read `Foo`, you cannot be sure
> >>     if it is heap or stack allocated.
> >> (3) generalizes common code of reference-counted objects (ie avoiding
> >>     code duplication) and concentration of `unsafe` code.
> >>
> >> In my opinion (1) is the most important, then (2). And (3) is a nice
> >> bonus. If you agree with the list above (maybe you also have additional
> >> advantages of `ARef`?) then it would be great if you could also add them
> >> somewhere here.
> >>
> > 
> > Basically to me, the advantages are mostly (1) and (2) in your list,
> > thank you for the list. And I did try to use an example (below) to
> > explain these, because I felt an example of the bad cases is
> > straightforward.
> > 
> > I will add your list here, because although an example may be
> > straightforward of reading, a list of advantages are better for
> > references. Again, thanks a lot!
> > 
> >>> +/// foo` defined in C, which has its own refcounting operations `get_foo()` and `put_foo()`. Without
> >>> +/// "[`ARef`] pattern", i.e. **bad case**:
> >>
> >> Instead of "bad case" I would have written "i.e. you want to avoid this:".
> >>
> > 
> > I'm OK with your version, but for my personal interest, why? ;-)
> 
> I felt like "bad case" did not "flow" right when reading and I also
> think that "you want to avoid this" sounds more polite :)
> 

Got it, will use "you want to avoid this" then.

> >>> +///
> >>> +/// ```ignore
> >>> +/// pub struct Foo(NonNull<foo>);
> >>> +///
> >>> +/// impl Foo {
> >>> +///     // An operation on the pointer.
> >>> +///     pub unsafe fn from_ptr(ptr: *mut foo) -> Self {
> >>> +///         // Note that whether `get_foo()` is needed here depends on the exact semantics of
> >>> +///         // `from_ptr()`: is it creating a new reference, or it continues using the caller's
> >>> +///         // reference?
> >>> +///         unsafe { get_foo(ptr); }
> >>> +///
> >>> +///         unsafe { Foo(NonNull::new_unchecked(foo)) }
> >>> +///     }
> >>> +///
> >>> +///     // An operation on the object.
> >>> +///     pub fn get_bar(&self) -> Bar {
> >>> +///         unsafe { (*foo.0.as_ptr()).bar }
> >>> +///     }
> >>> +/// }
> >>> +///
> >>> +/// // Plus `impl Clone` and `impl Drop` are also needed to implement manually.
> >>> +/// impl Clone for Foo {
> >>> +///     fn clone(&self) -> Self {
> >>> +///         unsafe { get_foo(self.0.as_ptr()); }
> >>> +///
> >>> +///         Foo(self.0)
> >>> +///     }
> >>> +/// }
> >>> +///
> >>> +/// impl Drop for Foo {
> >>> +///     fn drop(&mut self) {
> >>> +///         unsafe { put_foo(self.0.as_ptr()); }
> >>> +///     }
> >>> +/// }
> >>> +/// ```
> >>> +///
> >>> +/// In this case, it's hard to tell whether `Foo` represent an object of `foo` or a pointer to
> >>> +/// `foo`.
> >>> +///
> >>> +/// However, if using [`ARef`] pattern, `foo` can be wrapped as follow:
> >>> +///
> >>> +/// ```ignore
> >>> +/// /// Note: `Opaque` is needed in most cases since there usually exist C operations on
> >>
> >> I would disagree for the reason that `Opaque` is needed. You need it if
> >> the `foo` eg contains a bool, since C might just write a nonsense
> >> integer which would then result in immediate UB in Rust.
> >> Other reasons might be that certain bytes of `foo` are written to by
> >> other threads, even though on the Rust side we have `&mut Foo` (eg a
> >> `mutex`).
> >>
> > 
> > hmm.. "since there usually exist C operations on ..." include these two
> > cases you mentioned, no? Plus, the reference counters themselves are not
> > marked as atomic at the moment, so without `Opaque`, we also have UB
> > because of the reference counters. I was trying to summarize all these
> > as "C operations on ...", maybe I should say "concurrent C operations on
> > ..."? I am trying to be concise here since it's a comment inside a
> > comment ;-)
> 
> Ah that is your definition of "C operations", I interpreted it as "there
> are functions that take `struct foo *`". So maybe it would be good to
> spell out exactly why `Opaque` might be needed.
> I think its fine to be verbose here.
> 

Will do.

Regards,
Boqun

> ---
> Cheers,
> Benno
> 
> >>> +/// /// `struct foo *`, and `#[repr(transparent)]` is needed for the safety of converting a `*mut
> >>> +/// /// foo` to a `*mut Foo`
> >>> +/// #[repr(transparent)]
> >>> +/// pub struct Foo(Opaque<foo>);
> >>> +///
> >>> +/// impl Foo {
> >>> +///     pub fn get_bar(&self) -> Bar {
> >>> +///         // SAFETY: `self.0.get()` is a valid pointer.
> >>> +///         //
> >>> +///         // Note: Usually extra safety comments are needed here to explain why accessing `.bar`
> >>> +///         // doesn't race with C side. Most cases are either calling a C function, which has its
> >>> +///         // own concurrent access protection, or holding a lock.
> >>> +///         unsafe { (*self.0.get()).bar }
> >>> +///     }
> >>> +/// }
> >>> +/// ```
> >>> +///
> >>> +/// ## Avoid `impl AlwaysRefCounted` if unnecesarry
> 

      reply	other threads:[~2024-07-25 20:43 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-10  3:24 [RFC PATCH] rust: types: Add explanation for ARef pattern Boqun Feng
2024-07-23  9:14 ` Alice Ryhl
2024-07-24 17:44   ` Boqun Feng
2024-07-25 18:12     ` Benno Lossin
2024-07-25 20:29       ` Boqun Feng
2024-07-26 13:43         ` Benno Lossin
2024-07-26 14:26           ` Boqun Feng
2024-07-26 14:42             ` Benno Lossin
2024-07-26 15:15               ` Boqun Feng
2024-07-26 15:54                 ` Benno Lossin
2024-07-26 16:19                   ` Danilo Krummrich
2024-07-29 11:31                     ` Alice Ryhl
2024-07-31 14:48                       ` Benno Lossin
2024-07-25 18:51 ` Benno Lossin
2024-07-25 20:06   ` Boqun Feng
2024-07-25 20:32     ` Benno Lossin
2024-07-25 20:43       ` Boqun Feng [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZqK47uNmutavG9Vh@boqun-archlinux \
    --to=boqun.feng@gmail.com \
    --cc=a.hindborg@samsung.com \
    --cc=alex.gaynor@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=benno.lossin@proton.me \
    --cc=bjorn3_gh@protonmail.com \
    --cc=corbet@lwn.net \
    --cc=dakr@redhat.com \
    --cc=gary@garyguo.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ojeda@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=tmgross@umich.edu \
    --cc=viresh.kumar@linaro.org \
    --cc=wedsonaf@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).